CSV Parser for PHP
fgetcsv() sucks because it doesn't handle quoted newlines. So I wrote a simple CSV parser class for PHP myself, based on the JavaScript version that I wrote earlier (it doesn't handle quoted newlines either). Not prefect (e.g. newlines are standardized to \n, and string matching works only for ASCII or UTF-8), but does what I want it to do.
/** * CSV file parser * Currently the string matching doesn't work * if the output encoding is not ASCII or UTF-8 */ class CsvFileParser { var $delimiter; // Field delimiter var $enclosure; // Field enclosure character var $inputEncoding; // Input character encoding var $outputEncoding; // Output character encoding var $data; // CSV data as 2D array /** * Constructor */ function CsvFileParser() { $this->delimiter = ","; $this->enclosure = '"'; $this->inputEncoding = "ISO-8859-1"; $this->outputEncoding = "ISO-8859-1"; $this->data = array(); } /** * Parse CSV from file * @param content The CSV filename * @param hasBOM Using BOM or not * @return Success or not */ function ParseFromFile( $filename, $hasBOM = false ) { if ( !is_readable($filename) ) { return false; } return $this->ParseFromString( file_get_contents($filename), $hasBOM ); } /** * Parse CSV from string * @param content The CSV string * @param hasBOM Using BOM or not * @return Success or not */ function ParseFromString( $content, $hasBOM = false ) { $content = iconv($this->inputEncoding, $this->outputEncoding, $content ); $content = str_replace( "\r\n", "\n", $content ); $content = str_replace( "\r", "\n", $content ); if ( $hasBOM ) // Remove the BOM (first 3 bytes) { $content = substr( $content, 3 ); } if ( $content[strlen($content)-1] != "\n" ) // Make sure it always end with a newline { $content .= "\n"; } // Parse the content character by character $row = array( "" ); $idx = 0; $quoted = false; for ( $i = 0; $i < strlen($content); $i++ ) { $ch = $content[$i]; if ( $ch == $this->enclosure ) { $quoted = !$quoted; } // End of line if ( $ch == "\n" && !$quoted ) { // Remove enclosure delimiters for ( $k = 0; $k < count($row); $k++ ) { if ( $row[$k] != "" && $row[$k][0] == $this->enclosure ) { $row[$k] = substr( $row[$k], 1, strlen($row[$k]) - 2 ); } $row[$k] = str_replace( str_repeat($this->enclosure, 2), $this->enclosure, $row[$k] ); } // Append row into table $this->data[] = $row; $row = array( "" ); $idx = 0; } // End of field else if ( $ch == $this->delimiter && !$quoted ) { $row[++$idx] = ""; } // Inside the field else { $row[$idx] .= $ch; } } return true; } }
26 Comments:
Note that troll and spam comments will be deleted without any notification.
At 11/09/2006 09:31:00 pm, Anonymous said…
This was exactly the thing I was looking for. Thank you for saving me so much time!
At 11/23/2006 11:06:00 pm, jimeh said…
hey, good job on your class :)... i used the processing concept of your class to write my own which handles multiline quotes, and quotes within quoted values. hope you don't mind :)
http://zhuoqe.org/blog/2006/11/23/code-yard-and-php-csv-parser/
At 12/20/2007 01:25:00 am, Anonymous said…
Hi,
you made a great job with that class.
I have added some code to optionally change the output format and use associative arrays or even dynamically created objects (PHP5 required).
I hope you don't mind and someone finds the changes useful :D
View source code
--- hippydream ---
At 12/20/2007 01:36:00 am, minghong said…
Of course I don't mind…! ;-)
At 1/11/2008 01:39:00 am, Anonymous said…
Dear Sir:
Hi. I have seen your posts on Wikipedia and think they are great. I know you are very knowledgable and hope you don't mind this question. I was wondering if you might have any idea where an American might buy software to translate English into bo-po-mo-fo pronunciation of Chinse. Does such a thing exist for Palm Pilots?
Thanks
Diane Michel
At 9/10/2008 06:19:00 am, Anonymous said…
Here is a much faster and simpler function to parse a CSV file. (I benchmarked the original class at around 9 seconds for a 3mb file with 10,000 lines. This function does the same file in 0.60 seconds).
define('MAX_CSV_LINE_LENGTH', 7000);
//Uses fgetcsv() to parse a CSV file.
function csvParseFile($filename, $separator = "\t") {
$fp = fopen($filename, 'r');
if ($fp === false) {
return false;
}
$result = array();
while ($line = fgetcsv($fp, MAX_CSV_LINE_LENGTH, $separator)) {
array_push($result, $line);
}
fclose($fp);
return $result;
}
You can contact me at trusty_jim@hotmail.com
At 9/10/2008 09:36:00 am, minghong said…
The point is that fgetcsv doesn't handle quoted newlines. So it's not about the speed…
At 3/02/2009 03:12:00 pm, Anonymous said…
Alternatively you could use this package at PEAR
http://pear.php.net/package/File_CSV_DataSource
At 10/13/2009 07:55:00 am, Unknown said…
Hi, firstly many thanks about your job! Secondly, i am a CS student from Greece and i need some help, if yoy pleased. In my application(not for commercial use), the tables and their fields will be created dynamically. I used csv's headers for that purpose an i added another field "id". After the creation of the table i need to import the data to table. I use the following script:
$k=0;
$j=1;
//Insert Data to table
foreach ($csv->data as $key => $row){
$k=0;
foreach ($row as $value) {
echo "
$value
";
$sql= "INSERT INTO inouts ('id','$cols[$k]')
VALUES ($j,$value)";
// Execute query
$result=mysql_query($sql,$link);
if(mysql_error()){
echo mysql_error();
}
$k++;
}
$j++;
}
And i get the following error:
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''id','DMUS')
VALUES (1,Brighton)' at line 1
4453
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''id','INPUT1')
VALUES (1,4453)' at line 1
5456
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''id','INPUT2')
VALUES (1,5456)' at line 1
etc...
Thank you in advance, sorry about the huge post.
At 10/13/2009 10:23:00 am, minghong said…
You don't have single quotes around the string values.
e.g. VALUES (1,Brighton) should be VALUES (1,'Brighton')
Also, you should use mysql_real_escape_string() to escape the string values, so as to avoid SQL syntax error and SQL injection.
At 10/13/2009 08:50:00 pm, Unknown said…
Thank you for your reply, i did some changes as you proposed but i still get the same error.
&j=1;
foreach ($csv->data as $key => $row){
$k=0;
foreach ($row as $value) {
$sql= "INSERT INTO inouts ('id','$colmns')
VALUES (
'". mysql_real_escape_string($j) ."',
'". mysql_real_escape_string($value) ."' )";
// Execute query
$result=mysql_query($sql,$link);
if(mysql_error()){
echo mysql_error(); }
$k++;
}
$j++;
}
At 10/13/2009 08:55:00 pm, Unknown said…
I am sorry i missed that :
$colmns=mysql_real_escape_string($cols[$k]);
before the $sql statement.
At 10/08/2010 11:50:00 pm, SimonSimCity said…
This comment has been removed by the author.
At 10/08/2010 11:53:00 pm, SimonSimCity said…
Hi, all
Here's my solution of an mysql-importer:
function prepare_db_strings( &$string ) {
$string = mysql_real_escape_string( $string );
}
mysql_query("TRUNCATE `$tableName`;");
$query = "INSERT INTO `$tableName` VALUES ";
foreach ( $csvParser->data as $row ) {
array_walk($row, 'prepare_db_strings');
$query .= ' ("'.implode($row, '", "').'"),';
}
$query = substr($query, 0, -1);
mysql_unbuffered_query($query);
At 10/11/2010 03:52:00 pm, SimonSimCity said…
Hi, all
I think fgetcsv() can handle quoted-newlines now ...
I've tested the following csv-file:
1997,Ford,E350,"ac, abs, moon",3000.00
1999,Chevy,"Venture ""Extended Edition""","",4900.00
1999,Chevy,"Venture ""Extended Edition, Very Large""","",5000.00
1996,Jeep,Grand Cherokee,"MUST SELL!
air, moon roof, loaded",4799.00
Here's my php-code:
<?php
$res = fopen('test.csv', "r");
while (($data = fgetcsv($res)) !== FALSE) {
var_dump($data);
}
fclose($res);
?>
At 10/11/2010 03:59:00 pm, minghong said…
Maybe it got improved… After all, it has been 4 years already.
At 10/11/2010 04:18:00 pm, SimonSimCity said…
Hi, minghong
Yes - my example (copied from wikipedia) works.
I got a large csv-file and it crashed using fgetcsv() ...
I've also tried your php-class. It took a long time but it worked!
At 4/02/2011 12:55:00 am, Unknown said…
I've had limited results from this script to achieve what im looking for ... maybe someone can help. i need the php script to scan multiple csv files (preferably all in a folder), turn the data from each file into an array, then combine the data from each csv file and print it on the screen. i have a lot of csv files what aren't formatted properly but the data can be put into an array ... i'd like to use the data, fix it, and put everything back into 1 clean csv file.
At 4/02/2011 01:02:00 am, minghong said…
That should be simple. Just use a PHP library/class/function for reading the file and implement the logic you want.
Beside this class, you may like to give PHPExcel a try.
At 7/27/2011 10:51:00 pm, SimonSimCity said…
Hi, all
The only thing missing for me is a better handling of memory :)
Since there's no better way (as far as I know) I'd add a new parameter where you can insert a function that should be called when the scripts has finished reading a row.
Using this fix you can import allmost every csv-file using less memory.
If you don't know how to do: Here's a nice hint:
http://www.php.net/manual/de/function.call-user-func-array.php
p.s. maybe you'll find an option to replace file_get_contents() with fgets() to use even less memory :)
At 8/17/2011 11:23:00 am, Paid Critique said…
what could be the problem if everytime you upload an image it always display
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near
??????
Any help?
At 6/20/2012 06:05:00 am, Christopher Stevens said…
Six years later: This csv parser worked great for what I needed. Thanks minghong for posting!!
At 8/25/2013 01:59:00 am, Thomas said…
Is there a posibility to use semicolon istead of comma? i change it in the var $delimiter = ';'; but it doesnt work :(
At 12/24/2013 10:48:00 pm, Anonymous said…
check parsecsv-for-php:
http://code.google.com/p/parsecsv-for-php/
At 8/12/2019 05:58:00 pm, Eagle Focus(Hong Kong) Limited said…
nice post
At 11/25/2020 10:21:00 am, online erectile dysfunction prescription said…
I'm now not sure where you are getting your information, however good topic. I must spend some time studying more or working out more. Thank you for wonderful info I used to be looking for this info for my mission.
Post a Comment
<< Home