Hi
In my application I am experimenting with several string comparison methods, such as Like(), =, ==, At(), $, etc.
There are differences between them when comparing very many strings (millions of rows), so I am looking for the method that is best to use.
For example, like(text1,text2) is slower than at(text1,text2)>0, and that in turn is slower than text1 $ text2.
For about 10,000,000 comparisons the processing time is 2.6 seconds, 1.81 seconds and 1.52 seconds respectively.
Since my processes run for several hours, this time matters, and a better method would save me some time.
Is there a better way to compare text than the ones I have shown?
I also experimented with comparing in parts: first compare the first letter, and only when it is identical compare the full string. This also saves a few seconds, but perhaps a better solution still exists.
I also tried splitting the search key into letters and comparing them as letters or ASCII numbers, but this is not effective.
If somebody has some advice, I will be grateful.
Note: I compare strings at the beginning of a row, but also at any position in a row. It is always within one row; this is not full-text searching. The text has its own format, with certain characters at the beginning of rows, etc.
The function reads the rows of the whole text file in a loop and checks whether the key text is in each row. By the way, the text file is loaded into an array.
The text can look like this:
searched text ABCDE
row : bla bla bla .... ABCDE bla bla bla
or
row : ABCDE bla bla bla...
Is there a function with better performance than $ ?
String compare
Re: String compare
Are you using a client server such as ADS or Sql Server?
The eXpress train is coming - and it has more cars.
Re: String compare
hi,
Victorio wrote: by the way text file is loaded in array.
so where does the text originally come from: a file or a memo?
Do you want to do a "full text search" in a database?
I use an Xbase++ "Custom Index" for FTS in memo fields.
greetings by OHR
Jimmy
Re: String compare
The original text is in a plain ASCII text file. I read it into a string variable and then process this variable row by row, searching for EOL to find where each row ends.
The files are sometimes very large, hundreds of MB, so I must divide them into 100 MB parts and process each part separately.
I am only looking for a method that gives better performance when comparing strings.
Other methods, such as storing the data in ADS, would need total reprogramming.
Maybe a routine in C or C++ could be a solution, or moving the text searching module to an external C++ module, as I did many years ago when the application was in CA-Clipper.
Or I could create a non-GUI Xbase++ module for better performance and call it as an external module with RunShell, but this is not very clear to me yet.
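One idea that fits the "read into a string, then find EOL" approach above: instead of cutting out every row and comparing it, search the whole buffer for the key and look for the row boundaries only around a hit. Below is a minimal, language-neutral sketch in Python (not Xbase++). It assumes every row ends with the same EOL sequence and that keys never contain EOL; `find_rows_containing` is a hypothetical name. Whether the same idea pays off in Xbase++ would have to be measured.

```python
def find_rows_containing(buffer, key, eol="\n"):
    """Yield each row of `buffer` that contains `key`.

    Rows are not split up front: the key is searched in the whole
    buffer, and row boundaries are located only around each hit.
    """
    if not key:
        return  # an empty key would match everywhere
    pos = buffer.find(key)
    while pos != -1:
        # Row start: just after the previous EOL, or 0 for the first row.
        prev = buffer.rfind(eol, 0, pos)
        start = 0 if prev == -1 else prev + len(eol)
        # Row end: the next EOL, or the end of the buffer for the last row.
        end = buffer.find(eol, pos)
        if end == -1:
            end = len(buffer)
        yield buffer[start:end]
        # Continue after this row, so a row with several hits is reported once.
        pos = buffer.find(key, end)


if __name__ == "__main__":
    sample = "ABCDE bla bla\nnothing here\nbla bla ABCDE bla\n"
    for row in find_rows_containing(sample, "ABCDE"):
        print(row)
```

Rows without a hit are never copied or compared one by one, which is where a saving could come from when only a few rows match.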
Re: String compare
If you are using a client server in your app then a SQL SELECT statement is much faster than a locate or set filter.
Re: String compare
Victorio wrote: Original text is in text ascii file.
have you thought about "regular expressions"?
there is a sample from Phil Ide:
XbPCRE - PCRE (Perl Compatible Regular Expression) Library for Xbase++
greetings by OHR
Jimmy
Re: String compare
rdonnay wrote: If you are using a client server in your app then a SQL SELECT statement is much faster than a locate or set filter.
Roger, no, I am not using client server (or at the moment this is not important, because I need to optimize the low-level text processing).
I simply work with text files stored on disk.
I have some text keys that I search for in this text; there can be 1, 2 or thousands of keys.
I compare every row of the text file with all keys, to see whether the row is identical to a key or a key occurs somewhere in the row.
When the text file has 1,000,000 rows and I compare 1 key, it takes a few seconds.
But with 10,000 keys, 1,000,000 x 10,000 comparisons are needed, and this takes minutes or hours.
That is why I need to eliminate every useless operation, and every millisecond is important to me.
Jimmy: I will look at this sample. Thanks.
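For the rows x keys problem described above, one known technique is multi-pattern matching (Aho-Corasick): all keys are compiled once into an automaton, and each row is then scanned a single time, no matter how many keys there are. This is a swapped-in technique, not something anyone in the thread has tried. The sketch below is in Python (not Xbase++) and assumes plain, non-empty, case-sensitive keys; `build_automaton` and `find_keys` are hypothetical names.

```python
from collections import deque


def build_automaton(keys):
    """Compile keys into an Aho-Corasick automaton (goto, fail, out)."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link of each state
    out = [[]]    # keys recognised when a state is reached
    for key in keys:
        state = 0
        for ch in key:
            nxt = goto[state].get(ch)
            if nxt is None:
                nxt = len(goto)
                goto[state][ch] = nxt
                goto.append({})
                fail.append(0)
                out.append([])
            state = nxt
        out[state].append(key)
    # Breadth-first pass: failure links of shallow states are set first.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # A state also reports the keys that end at its failure target.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_keys(text, automaton):
    """Return the set of keys that occur anywhere in `text` (one pass)."""
    goto, fail, out = automaton
    found = set()
    state = 0
    for ch in text:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        found.update(out[state])
    return found


if __name__ == "__main__":
    automaton = build_automaton(["ABCDE", "1286"])
    print(find_keys("bla bla ABCDE bla c: 1286", automaton))
```

The automaton is built once per key set, so the cost per row depends mainly on the row length rather than on the number of keys. A C or C++ routine with the same structure would be the natural place for it if pure Xbase++ turns out too slow.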
Re: String compare
Hi,
Do you need to know which text files include your key, or do you need the row in the text file?
If there are multiple keys, do you need to know if ALL keys are in the text, or if ONE of the keys is in the text?
Can you post a sample text file and sample keys to search for?
Re: String compare
skiman wrote: Do you need to know which text files include your key, or do you need the row in the text file?
The text file consists of blocks. Each block begins with "| POLOZKA VZ "; this is one of the control "words" by which I know where a block begins, and it ends at a row of
"================================================================================"
While reading, I save the rows of a block, beginning with POLOZKA... and ending with ======, to a temporary variable.
Then I search this block to see whether it contains any of the x keys.
If a key is found in any row of the block, I save the whole block to the output and go on to read the next block.
skiman wrote: If there are multiple keys, do you need to know if ALL keys are in the text, or if ONE of the keys is in the text?
I use a combination: sometimes all keys, sometimes one of the keys, and sometimes the keys come in pairs, and I must find both keys of a pair, for all pairs or for any of them.
skiman wrote: Can you post a sample text file and sample keys to search for?
Here is a sample:
================================================================================
| POLOZKA VZ : 1 / 2000, riadok :3 | 03.01.2000 o 11:15: 4| Kod :24342
==================================
VLASTNICI PARCIEL -- ZRUSENIE -- LIST VLASTNICTVA c: 1286 , spoluvl.: 1
Cislo LV Stare: 1286
Por.c.spoluvl. Stare: 1
Citatel vlast.pod. Stare: 1
Menovatel vlast.pod. Stare: 1
Polozka VZ Stare: 100
Meno,adr.vlastnika Stare: pokusny testovaci zaznam po 1.1.2000
Kontr.kod Stare: 14141
================================================================================
| POLOZKA VZ : 2 / 2000, riadok :7 | 02.02.2000 o 7:40:46| Kod :252
==================================
VLASTNICI PARCIEL -- ZRUSENIE -- LIST VLASTNICTVA c: 219 , spoluvl.: 1
Cislo LV Stare: 219
Por.c.spoluvl. Stare: 1
Citatel vlast.pod. Stare: 6
Menovatel vlast.pod. Stare: 9
Polozka VZ Stare: 3199
Typ identifikatora Stare: 3
ICO ,rod.c. Stare: 19365817
Meno,adr.vlastnika Stare: SMOTER LADISLAV A MARIA R LENCESOVA KOSICE
Kontr.kod Stare: 53722
Sample keys can be random; sometimes I use formatted keys such as:
* "ID.CISLO STAVBY c: "+alltrim(str(ICS))
* "ID.C.PRAVNEHO VZTAHU : "+alltrim(str(IDC))
* "ID.CISLO PRIESTORU c: "+alltrim(str(ICP))
* "C-PARCELY -- AKTUALIZACIA -- PARCELA c: "+alltrim(prevpar5(str(CPA)))
* "E-PARCELY -- AKTUALIZACIA -- PARCELA c: "+alltrim(prevpar5(str(CPA)))
* "E-PARCELY -- AKTUALIZACIA -- PARCELA c: "+alltrim(prevpar5(str(CPA)))+" , Umiest.: "+alltrim(str(CPU))
but sometimes the searched text can be any word, number, or combination of letters, numbers and other characters.
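The block handling and the "any / all / pairs" rules described in this post can be sketched as follows. This is only an illustration in Python (not Xbase++). It assumes a block opens at a row starting with "| POLOZKA VZ " and closes at a row of exactly 80 "=" characters, and that keys never span two rows. The helper names and the `mode` parameter are hypothetical.

```python
BLOCK_START = "| POLOZKA VZ "  # control word that opens a block
BLOCK_END = "=" * 80           # assumed width of the closing separator row


def split_blocks(rows):
    """Yield blocks as lists of rows, from BLOCK_START up to BLOCK_END."""
    block = None
    for row in rows:
        if row.startswith(BLOCK_START):
            block = [row]
        elif block is not None:
            if row.rstrip() == BLOCK_END:
                yield block
                block = None
            else:
                block.append(row)
    if block is not None:
        yield block  # last block in the file, without a closing separator


def block_matches(block, keys, mode="any"):
    """mode 'any': at least one key occurs; mode 'all': every key occurs."""
    text = "\n".join(block)
    hits = (key in text for key in keys)
    return any(hits) if mode == "any" else all(hits)


def block_matches_pairs(block, pairs, mode="any"):
    """A pair counts only when both of its keys occur in the block."""
    text = "\n".join(block)
    hits = (a in text and b in text for a, b in pairs)
    return any(hits) if mode == "any" else all(hits)


if __name__ == "__main__":
    rows = [
        "=" * 80,
        "| POLOZKA VZ : 1 / 2000, riadok :3",
        "LIST VLASTNICTVA c: 1286 , spoluvl.: 1",
        "=" * 80,
        "| POLOZKA VZ : 2 / 2000, riadok :7",
        "LIST VLASTNICTVA c: 219 , spoluvl.: 1",
    ]
    for block in split_blocks(rows):
        if block_matches(block, ["c: 219 "]):
            print("\n".join(block))
```

Searching the joined block once per key, instead of every row per key, reduces the number of comparisons when blocks have many rows, and `any()` stops at the first hit.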