Old 30-03-2005, 09:26   #1 (permalink)
mattih5
Registered User
 
 
Join Date: Mar 2005
Location: Manchester
Posts: 62
PHP - Finding the occurrences of a string in a file

Hi all,

I have this script which I have been developing at university. It is supposed to search for a directory, or for a file within a directory; the front-end form hasn't been designed yet.

Anyway, I'm concentrating on searching for a string within a file at the moment, as I have a deadline in 2 weeks. The while loop in the else portion of the if statement is supposed to find all occurrences of the string assigned to $search. Ultimately that could be anything, but for test purposes I have assigned 'test' as the string.

So far I have the script working so that it checks all files in a directory for this string and returns the path of each file. When the script has printed the file path, it prints the number of lines the search string was found on.

But I need the script to do exactly the above and, alongside printing the line numbers the search term was found on (starting at 1, not 0), print the whole line of text with the search word in bold and uppercase.

Please could anyone provide me with some pointers, as I'm tearing my hair out over it. I kind of understand how to go about it, but I cannot turn this logic into code.

Below is the code I have so far:

Quote:
[INDENT]<?php

//A recursive function: Displays content of directory passed as argument + all its subdirectories.
//Displays directory names. Displays files as Hot URLs. Indents to show levels of nesting.
function display_directory($dir1)
{
    STATIC $min_depth = 20; //initialised to greater than any likely depth

    if(!is_string($dir1))
        return FALSE;

    //Calculate indentation for display:
    //NB DOS slash used to split string into an array:
    $depth = explode("\\", $dir1); //Make directory string into an array
    $current_depth = sizeof($depth); //Count no. of elements in array
    //First time through display_directory(), $min_depth will be set
    //(Every other time, $current_depth will be >= min_depth)
    if($current_depth < $min_depth)
        $min_depth = $current_depth;

    $dir = $dir1;
    $dh = opendir($dir);
    while ($file = readdir($dh))
    {
        //Don't print '.' & '..' directory entries:
        if(($file != ".") && ($file != ".."))
        {
            $file = $dir."\\".$file; //NB DOS slash to join path & file names
            //Do indentation:
            for($i=0;$i <= ($current_depth - $min_depth); $i++)
            //NB @ suppresses error messages: only way to test for DOS file v. directory
            if($ddh = @opendir($file)) //NB @ means 'suppress error messages'
            {
                //then $file is a directory
                print "<b>Directory: $file</b><br>";
                display_directory($file); //NB function calls itself recursively
            }
            else
            {
                $search = "test";
                $srcfile = $file;
                $TestString = @implode('', file($srcfile));
                $nmatches = preg_match_all("/($search)/", $TestString, $matches);
                if ($nmatches != '0') {
                    print "<a href=\"".$file."\">".$file."</a><br/>";
                    while(list($key,$val) = each($matches[1])){
                        print "<strong>Line</strong> "."<b>$key; </b><br>";
                        print "&nbsp;&nbsp;";
                        print "<strong>".strtoupper($val)."</strong>"."<br/>";
                    }
                }

            }
        }//end if (!. && !..)
    }//end while
    closedir($dh);
    return TRUE;
}//end of function

//Script START:
print "<b>Search string found in ".getcwd().":</b><br><br>";
display_directory(getcwd()); //Lists current working directory


?>
[/indent]
Old 30-03-2005, 13:00   #2 (permalink)
Luke Redpath
Barney army!
 
 
Join Date: Mar 2003
Location: London
Posts: 696
I've had a bit of spare time, so here's a nice class that encapsulates what you want to do (feel free to reuse it wherever you like). It looks lengthier than it is because of the comments and doc tags but these should help you understand the class.

I've broken down the functionality so you can provide a file to search (and it will just search for the search string in that file), or a directory, in which case it will recursively scan through the directory and sub-directories and look for the search term in each file it finds.

PHP Code:
/**
 * FileSystemStringSearch
 * Searches a file or directory of files for a search string
 */
class FileSystemStringSearch
{
    //  MEMBERS

    /**
     * @var string $_searchPath path to file/directory to search
     */
    var $_searchPath;

    /**
     * @var string $_searchString the string to search for
     */
    var $_searchString;

    /**
     * @var array $_searchResults holds search result information
     */
    var $_searchResults;

    //  CONSTRUCTOR

    /**
     * Class constructor
     * @param string $searchPath path to file or directory to search
     * @param string $searchString string to search for
     * @return void
     */
    function FileSystemStringSearch($searchPath, $searchString)
    {
        $this->_searchPath = $searchPath;
        $this->_searchString = $searchString;
        $this->_searchResults = array();
    }

    //  MANIPULATORS

    /**
     * Checks path is valid
     * @return bool
     */
    function isValidPath()
    {
        if(file_exists($this->_searchPath)) {
            return true;
        } else {
            return false;
        }
    }

    /**
     * Determines if path is a file or directory
     * @return bool
     */
    function searchPathIsFile()
    {
        // check for trailing slash
        if(substr($this->_searchPath, -1, 1)=='/' ||
           substr($this->_searchPath, -1, 1)=='\\') {
           return false;
        } else {
           return true;
        }
    }

    /**
     * Searches given file for search term
     * @param string $file the file path
     * @return void
     */
    function searchFileForString($file)
    {
        // open file to an array
        $fileLines = file($file);

        // loop through lines and look for search term
        $lineNumber = 1;
        foreach($fileLines as $line) {
            $searchCount = substr_count($line, $this->_searchString);
            if($searchCount > 0) {
                // log result
                $this->addResult($file, $line, $lineNumber, $searchCount);
            }
            $lineNumber++;
        }
    }

    /**
     * Adds result to the result array
     * @param string $lineContents the line itself
     * @param int $lineNumber the file line number
     * @param int $searchCount the number of occurrences of the search term
     * @return void
     */
    function addResult($filePath, $lineContents, $lineNumber, $searchCount)
    {
        $this->_searchResults[] = array('filePath' => $filePath,
                                        'lineContents' => $lineContents,
                                        'lineNumber' => $lineNumber,
                                        'searchCount' => $searchCount);
    }

    /**
     * Takes a given string (usually a line from search results)
     * and highlights the search term
     * @param string $string the string containing the search term(s)
     * @return string
     */
    function highlightSearchTerm($string)
    {
        return str_replace($this->_searchString,
                           '<strong>'.$this->_searchString.'</strong>',
                           $string);
    }

    /**
     * Recursively scan a folder and sub folders for search term
     * @param string path to the directory to search
     * @return void
     */
    function scanDirectoryForString($dir)
    {
        $subDirs = array();
        $dirFiles = array();

        $dh = opendir($dir);
        while(($node = readdir($dh)) !== false) {
            // ignore . and .. nodes
            if(!($node=='.' || $node=='..')) {
                if(is_dir($dir.$node)) {
                    $subDirs[] = $dir.$node.'/';
                } else {
                    $dirFiles[] = $dir.$node;
                }
            }
        }

        // loop through files and search for string
        foreach($dirFiles as $file) {
            $this->searchFileForString($file);
        }

        // if there are sub directories, scan them
        if(count($subDirs) > 0) {
            foreach($subDirs as $subDir) {
                $this->scanDirectoryForString($subDir);
            }
        }
    }

    /**
     * Run the search
     * @return void
     */
    function run()
    {
        // check path exists
        if($this->isValidPath()) {

            if($this->searchPathIsFile()) {
                // run search on the file
                $this->searchFileForString($this->_searchPath);
            } else {
                // scan directory contents for string
                $this->scanDirectoryForString($this->_searchPath);
            }

        } else {

            die('FileSystemStringSearch Error: File/Directory does not exist');

        }
    }

    //  ACCESSORS

    function getResults()
    {
        return $this->_searchResults;
    }

    function getResultCount()
    {
        $count = 0;
        foreach($this->_searchResults as $result) {
            $count += $result['searchCount'];
        }
        return $count;
    }

    function getSearchPath()
    {
        return $this->_searchPath;
    }

    function getSearchString()
    {
        return $this->_searchString;
    }
}


Here's some example usage:

PHP Code:
$searcher = new FileSystemStringSearch('/path/to/some/file/or/directory/', 'somesearchterm');
$searcher->run();

if($searcher->getResultCount() > 0) {
    echo('<p>Searched "'.$searcher->getSearchPath().'" for string <strong>"'.$searcher->getSearchString().'":</strong></p>');
    echo('<p>Search term found <strong>'.$searcher->getResultCount().' times.</strong></p>');
    echo('<ul>');
    foreach($searcher->getResults() as $result) {
        echo('<li><em>'.$result['filePath'].', line '.$result['lineNumber'].'</em>:<br />'
             .$searcher->highlightSearchTerm($result['lineContents']).'</li>');
    }
    echo('</ul>');
} else {
    echo('<p>Searched "'.$searcher->getSearchPath().'" for string <strong>"'.$searcher->getSearchString().'":</strong></p>');
    echo('<p>No results returned</p>');
}


Just one thing to remember: to search through a directory, make sure the directory path has a trailing slash (or the class will think it's a file).
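
As a possible way round the trailing-slash requirement, the path type could be checked with PHP's is_dir() and is_file() instead of inspecting the last character. This is only a sketch and not part of the class above: describeSearchPath() is a hypothetical helper name, and the 'type'/'path' keys are made up for illustration.

```php
<?php
// Hypothetical helper (not part of FileSystemStringSearch): classify a
// search path by asking the filesystem rather than by its last character.
function describeSearchPath($path)
{
    if (is_dir($path)) {
        // Normalise to exactly one trailing slash, so that later
        // concatenation such as $dir.$node still produces a valid path.
        return array('type' => 'dir', 'path' => rtrim($path, '/\\') . '/');
    }
    if (is_file($path)) {
        return array('type' => 'file', 'path' => $path);
    }
    return array('type' => 'missing', 'path' => $path);
}

// Example: the system temp directory is reported as a directory whether
// or not the path was written with a trailing slash.
$info = describeSearchPath(sys_get_temp_dir());
echo $info['type'], ': ', $info['path'], "\n";
```

With something like this, searchPathIsFile() could in principle return the result of is_file(), and the caller would no longer need to remember the slash.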

Hope that helps.
__________________
Luke Redpath .::. Software Engineer .::. Reevoo - Real Reviews From Real Customers

Last edited by Luke Redpath : 31-03-2005 at 01:37.
Old 31-03-2005, 17:41   #3 (permalink)
mattih5
Non function approach

Hi, thanks for your reply - brilliant.

Only one problem: how do you approach this task in a non-functional way, i.e. just so it can be implemented into my code above? All I am really stuck on is getting the else portion of the script to scan all files in a directory for a string and report the line number, followed by a new line with the sentence the search term appears in, with the term in bold and uppercase. I can do all the formatting (bold, uppercase); it is just the process of looping through the files and reporting the line number with the line containing the search string.

Thanks again.
Old 31-03-2005, 17:58   #4 (permalink)
Luke Redpath
Hi,

I created the class because I found your code quite difficult to read and understand, and it was quicker for me to write the class. It should be quite easy to take the class and use it in place of your code (stick the class in a separate file and include it). It will make your code much easier to follow, and hey, it's all part of the learning process, right?

In answer to your specific question, take a look at the following functions from the class.

For looping through lines of a given file (looping through files in a given directory and looping through directories recursively is handled elsewhere):

PHP Code:
function searchFileForString($file)
    {
        // open file to an array
        $fileLines = file($file);

        // loop through lines and look for search term
        $lineNumber = 1;
        foreach($fileLines as $line) {
            $searchCount = substr_count($line, $this->_searchString);
            if($searchCount > 0) {
                // log result
                $this->addResult($file, $line, $lineNumber, $searchCount);
            }
            $lineNumber++;
        }
    }

Note that I create a variable $lineNumber which increments as I loop through each line in the file. I can use this variable to log the line number when I call the addResult() function (which saves a match to an array). That's the first part dealt with: saving the line number along with each search result. Note that I also save the contents of the line, as well as the file name/path and the number of times the search term was found on that line (in case it appears more than once in a given line).

In my presentation code (the "example usage" above), when I loop through the search results using foreach:

Code:
foreach($searcher->getResults() as $result) ...

, the resulting $result variable is an associative array representing one result, with the following data available: filePath, lineContents, lineNumber, searchCount. Use $result['lineNumber'] to print out the line number. Use $result['lineContents'] to simply print out the contents of the line.

And now the bit relating to highlighting the search term...instead of just displaying $result['lineContents'], pass it into the class function highlightSearchTerm(). This will highlight the search term each time it appears on the line.

The simple bit of code which does that is this:

PHP Code:
str_replace($this->_searchString,
            '<strong>'.$this->_searchString.'</strong>',
            $string);
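
The original request also asked for the term in uppercase, and str_replace() only matches the exact case given. One way that might be handled is sketched below. It is an illustration rather than a method of the class: highlightUpper() is a hypothetical function name, and it swaps str_replace() for preg_split() so that matching is case-insensitive and each piece can be HTML-escaped separately.

```php
<?php
// Hypothetical helper: wrap every case-insensitive occurrence of $term
// in <strong> tags, uppercased, and HTML-escape everything else.
function highlightUpper($line, $term)
{
    if ($term === '') {
        // An empty pattern would match between every character.
        return htmlspecialchars($line);
    }

    // preg_quote() makes the term literal; the capture group plus
    // PREG_SPLIT_DELIM_CAPTURE keeps the matched text in the result.
    $pattern = '/(' . preg_quote($term, '/') . ')/i';
    $parts = preg_split($pattern, $line, -1, PREG_SPLIT_DELIM_CAPTURE);

    $html = '';
    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            // Odd positions hold the matches: uppercase first, then escape.
            $html .= '<strong>' . htmlspecialchars(strtoupper($part)) . '</strong>';
        } else {
            $html .= htmlspecialchars($part);
        }
    }
    return $html;
}

echo highlightUpper('This is a Test, only a test.', 'test'), "\n";
```

Escaping each piece separately means markup in the searched file is shown as text rather than interpreted by the browser.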

Hope that explains things better.
Old 02-04-2005, 02:57   #5 (permalink)
mattih5
What am I doing wrong here?

Does anybody know what I'm doing wrong here:

Quote:
$SearchString = "test";
print "<a href=\"".$file."\">".$file."</a><br/>";

$srcFile = "$file";
fileArray = @file($srcFile);
reset($fileArray);
while(list($key,$val)=each($fileArray)){
if (stristr($SearchString,$val)){
print htmlentities($ResultLine);
}
}

I keep getting an error with the while loop and I'm not sure if I have coded the if statement correctly.
Old 02-04-2005, 04:05   #6 (permalink)
Luke Redpath
A couple of pointers...

First of all it would be a lot more helpful if you posted your code between PHP blocks instead of QUOTE blocks.

Second of all, it helps if you post the error message you are getting.
Old 03-04-2005, 05:30   #7 (permalink)
mattih5
$SearchString is the test string I want to find every occurrence of, and $file comes from a previous section of code which identifies all files within a directory. This is then hyperlinked in the code below.

PHP Code:
$SearchString = "test";
print "<a href=\"".$file."\">".$file."</a><br/>";
$fileArray = array();
$srcFile = "$file";
$fileArray = @file($srcFile);
reset($fileArray);
while(list($key,$val)=each($fileArray)){
    if (stristr($SearchString,$val)){
        print htmlentities($ResultLine);
    }
}


The errors I keep getting are the following:

Quote:
Warning: reset(): Passed variable is not an array or object in c:\inetpub\wwwroot\oop\srchfile.php on line 83

and

Quote:
Warning: Variable passed to each() is not an array or object in c:\inetpub\wwwroot\oop\srchfile.php on line 84

I do not fully understand what this while loop is doing so I cannot help you with where the variables have come from. If anyone could help, it would be much appreciated.
Old 03-04-2005, 12:05   #8 (permalink)
Luke Redpath
OK, is there a reason why you are doing this:

PHP Code:
$srcFile = "$file";
$fileArray = @file($srcFile);

Can you not just do this?

PHP Code:
$fileArray = file($file);

There shouldn't be any need to reset the array either.

You shouldn't suppress errors when you can prepare for them instead. Try checking that the file $file exists using the file_exists() function. Put that in an if...else statement: if the file exists, carry on; if it doesn't, print out an error.

If you are reading a file with file() and then getting errors about $fileArray not being an array, it usually means file() could not find or open the file: on failure it returns false rather than an array.
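
To make that concrete, here is a rough sketch of checking before reading rather than silencing file() with @. The function name readLinesOrReport() and the wording of the messages are my own assumptions for illustration, not something from the code posted so far.

```php
<?php
// Hypothetical helper: return the file's lines as an array, or print a
// descriptive message and return false when the file cannot be read.
function readLinesOrReport($file)
{
    // is_file() is false for directories and for paths that do not exist.
    if (!is_file($file) || !is_readable($file)) {
        echo 'Error: cannot read file: ', htmlspecialchars($file), "<br>\n";
        return false;
    }

    // file() returns false on failure, so test the result explicitly.
    $lines = file($file);
    if ($lines === false) {
        echo 'Error: file() failed for: ', htmlspecialchars($file), "<br>\n";
        return false;
    }
    return $lines;
}

// Usage: only loop when an array actually came back.
$lines = readLinesOrReport('no-such-file.txt');
if ($lines !== false) {
    echo count($lines), " lines read\n";
}
```

The strict comparison with false matters here, because an empty file gives an empty array, which is a valid result and not an error.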
Old 03-04-2005, 14:22   #9 (permalink)
mattih5
Still get error

Hi, I'm still getting errors; one in particular is this one:

Quote:
Warning: Variable passed to each() is not an array or object in c:\inetpub\wwwroot\oop\srchfile.php on line 83


I've modified the script to how you said:

PHP Code:
$SearchString = "test";
print "<a href=\"".$file."\">".$file."</a><br/>";
$fileArray = array();
$fileArray = file($file);
if (file_exists($file)) {
    while(list($key,$val)=each($fileArray)){
        if (stristr($SearchString,$val)){
            print htmlentities($ResultLine);
        }
    }
} else {
    echo "Error";
}



All I need it to do is search a directory for the string held in $SearchString. $file is a variable set earlier in the recursive function, which holds each file name in turn. Once the search string is found, the script needs to report the line number (starting at 1) to the user and then print out the whole line the string was found on. So the script needs to search every file in the directory in which it is placed.
Old 03-04-2005, 16:20   #10 (permalink)
Luke Redpath
PHP Code:
$fileArray = file($file);
if (file_exists($file))

You need to do the file_exists check before opening it with file, so put $fileArray = file($file) inside your if statement.

PHP Code:
} else {
    echo "Error";
}


Always best to try and be a bit more descriptive with your errors, so perhaps "Error: file not found" would be better.

Try this instead of the while loop:

PHP Code:
foreach($fileArray as $line) {
    // stristr() takes the string to search in first, then the search term
    if(stristr($line, $SearchString)) {
        print htmlentities($line);
    }
}
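
For the line-number part of the requirement, something along these lines might do without the class. It is a sketch under assumptions: findMatchingLines() is a hypothetical helper name, the temporary file exists only so the example runs on its own, and stripos() is used in place of stristr() because only a found/not-found answer is needed.

```php
<?php
// Hypothetical helper: return the matching lines of $file, keyed by
// their 1-based line number. $file and $SearchString are the thread's
// own variable names.
function findMatchingLines($file, $SearchString)
{
    $matches = array();
    if ($SearchString === '' || !is_file($file)) {
        return $matches;
    }
    $lines = file($file);
    if ($lines === false) {
        return $matches;
    }
    foreach ($lines as $index => $line) {
        // stripos() returns false when the term is absent; position 0
        // is a valid match, hence the strict comparison.
        if (stripos($line, $SearchString) !== false) {
            // file() numbers lines from 0, so add 1 for display.
            $matches[$index + 1] = rtrim($line, "\r\n");
        }
    }
    return $matches;
}

// Stand-alone demonstration using a throwaway file.
$file = tempnam(sys_get_temp_dir(), 'srch');
file_put_contents($file, "first line\nthis is a test\nnothing here\nTEST again\n");
$SearchString = 'test';

foreach (findMatchingLines($file, $SearchString) as $lineNumber => $text) {
    print "<strong>Line $lineNumber</strong><br>\n";
    print htmlentities($text) . "<br>\n";
}
unlink($file);
```

In the recursive script, the foreach at the bottom would go in the else branch, with $file coming from the directory loop instead of the temporary file.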


Have you considered using the class I posted in my first reply? It's tested, documented and works. It might save you a lot of time, and the implementation is a piece of cake.
Old 18-03-2008, 08:22   #11 (permalink)
Kellendil
Registered User
 
Join Date: Mar 2008
Posts: 1
Hi, I realise that this thread is very very old. But I was hoping someone could help me a bit with a problem.

The class Luke Redpath made works really really nice, but I want to expand a bit on it, and I'm just not sure how to do it.

What I want is to print more than just the line where I find the search term I'm looking for.

For instance, If I have a document like this:

KUNDENR........: XXXX
NAME: whatever
ADRESS: hello thar
EMAIL: myes@hi.thar

items bought:
993429349 Whatever.
995454542 Whatever2.

Now, I have a file filled with similar entries like that, one after another (it is a long list of invoices), and I want to search for a specific item and print the whole entry if I can find the item number.

Currently if I search for 9788203186233, it will print:
Code:
filer/2007 (03) 10001-15000, line 104407: 9788203186233 FLAGGERMUSMANNEN 99,00 1 99,00 0% 99,00

But I want, for instance, the 10 lines above that as well as the line with the search term.
(Even more specifically, I want to find the string "KUNDENR........: XXXX" that precedes the search string, so I can find all the customers that have bought a specific item.)


I hope that made enough sense, if not, please ask any questions you need.

Last edited by Kellendil : 18-03-2008 at 08:34.
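
One possible approach to the question above, sketched under assumptions: while looping over the lines, remember the most recent line that starts with the customer marker, and report it whenever the item number turns up. The function names findCustomersForItem() and contextBefore() are hypothetical, the customer numbers 1001 and 1002 are invented sample data, and the code assumes every invoice entry begins with a KUNDENR line.

```php
<?php
// Hypothetical helper: for each line containing $itemNumber, return the
// most recent preceding line that starts with $marker.
function findCustomersForItem(array $lines, $itemNumber, $marker = 'KUNDENR')
{
    $found = array();
    $currentCustomer = null;
    foreach ($lines as $index => $line) {
        if (strpos(ltrim($line), $marker) === 0) {
            // A new invoice entry starts here; remember its header line.
            $currentCustomer = trim($line);
        } elseif ($currentCustomer !== null && strpos($line, $itemNumber) !== false) {
            $found[] = array('customer'     => $currentCustomer,
                             'lineNumber'   => $index + 1,
                             'lineContents' => rtrim($line, "\r\n"));
        }
    }
    return $found;
}

// Hypothetical helper: the matching line plus up to $count lines before it.
function contextBefore(array $lines, $index, $count = 10)
{
    $start = max(0, $index - $count);
    return array_slice($lines, $start, $index - $start + 1);
}

// Invented sample data in the shape described in the post.
$lines = array(
    "KUNDENR........: 1001\n",
    "NAME: whatever\n",
    "items bought:\n",
    "993429349 Whatever.\n",
    "KUNDENR........: 1002\n",
    "items bought:\n",
    "995454542 Whatever2.\n",
);

foreach (findCustomersForItem($lines, '995454542') as $hit) {
    echo $hit['customer'], ' (line ', $hit['lineNumber'], ")\n";
}
```

In the class, this kind of bookkeeping would probably sit inside searchFileForString(), with the customer line passed along to addResult(). For a file of more than 100,000 lines, reading line by line with fgets() instead of file() may be worth considering to keep memory use down.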