String compare

Xbase++ 2.0 Build 554 or later
Victorio
Posts: 621
Joined: Sun Jan 18, 2015 11:43 am
Location: Slovakia

String compare

#1 Post by Victorio »

Hi,

I am working on string comparison again. My app has a function that converts Latin2 (CP1250) to ASCII, but it is slow (DO CASE...ENDCASE); without the conversion my search function is about 20x faster.

So I looked into SetLexRule(), where I defined my national characters.

If I write something like:

SetLexRule(.... array with converted characters)
SET LEXICAL ON
x1 := "fén"
x2 := "fen"
? x1 = x2          // .T.
? At(x1,x2) != 0   // .F. ???? Why?

Does SET LEXICAL ON have no effect on At()?

Auge_Ohr
Posts: 1407
Joined: Wed Feb 24, 2010 3:44 pm

Re: String compare

#2 Post by Auge_Ohr »

Victorio wrote: I am working on string comparison again. My app has a function that converts Latin2 (CP1250) to ASCII, but it is slow (DO CASE...ENDCASE); without the conversion my search function is about 20x faster.
hm ... is Latin2 (CP1250) the local code page of your Windows OS()? Which OS()?
I am not sure why you want to convert it to ASCII and not to ANSI?
Victorio wrote:

Code:

? at(x1,x2)!=0 .F. ???? Why?
Does SET LEXICAL ON have no effect on At()?

The help file says that it works only with

Code:

=
>=
<=
<>

It also does not work with

Code:

==

which does an exact (binary) comparison.
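
Since At() is not affected by SET LEXICAL, both strings have to be normalized before the substring search. A minimal sketch of that idea, using only StrTran() and At() on the "fén" / "fen" example from the question (the single replaced character is just for illustration):

```
PROCEDURE Main()
LOCAL x1 := "fén"
LOCAL x2 := "fen"

   // At() compares the raw characters, so fold the accented
   // character to its plain counterpart before searching.
   ? At(StrTran(x1, "é", "e"), x2) != 0
RETURN
```

With the accent folded away, both sides are the same plain string, so the search no longer depends on any lexical rule.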
greetings by OHR
Jimmy

Re: String compare

#3 Post by Victorio »

Hi,

hmm... so At() cannot be used for this. OK, thanks.

"hm ... is Latin2 (CP1250) the local code page of your Windows OS()? Which OS()?
I am not sure why you want to convert it to ASCII and not to ANSI?"

I have many databases with a mix of code pages: Latin II (CP852) and also Latin II Windows (CP1250).

When I search for some text in this data (name, surname, or other text), the text in the database may be written with or without national characters (code page characters 128 - 255), so I must convert everything to plain ASCII characters (codes 0 - 127).
Example:

I need to search for the name Čičvakovič.
In the database it can be Čičvakovič, or CICVAKOVIC, or Cicvakovic, ...

First I convert the search key to CICVAKOVIC, and then I search the database, with every row also converted via UPPER(LAT_IBM(row)).

searchkey := upper(lat_ibm(searchkey))   && convert the key once, before the loop
do while !eof()
   if at(searchkey, upper(lat_ibm(row))) != 0
      * name found, list to report...
   endif
   skip   && move to the next record
enddo

(I do not know whether the characters I write here display correctly for you...)

The function LAT_IBM is in the attachment.

Today I modified it to run the DO CASE only for character codes 128 - 255, which makes my function 3x faster.
With LAT_IBM the search takes, for example, 7 seconds;
without LAT_IBM, 1 second.

I hope some further modifications can improve the speed.

Victorio
Attachments
LAT_IBM.ZIP
(1.38 KiB) Downloaded 777 times

Re: String compare

#4 Post by Victorio »

Now I have also tested this:

#define sdiak "áäčďéěíĺľňóôőöŕšťúůűüýřžÁÄČĎÉĚÍĹĽŇÓÔŐÖŔŠŤÚŮŰÜÝŘŽ"
#define bdiak "aacdeeillnoooorstuuuuyrzAACDEEILLNOOOORSTUUUUYRZ"

FUNCTION lat_ibm_u2(ptext)
LOCAL outriadok:=space(0),pocet:=0,i:=0,pom:=0
pocet:=len(ptext) && number of characters in the line
for i:=1 to pocet
   pom:=at(ptext[i],sdiak) && position of this character in the accented set
   if pom=0
      outriadok+=ptext[i] && not accented, copy unchanged
   else
      outriadok+=bdiak[pom] && replace with the unaccented counterpart
   endif
next
RETURN outriadok

But this is 2x slower than LAT_IBM with the DO CASE...ENDCASE syntax; I think that is because the At() function is slow.
SubStr() is also slow and not usable for this, like the other string manipulation functions.

Would the function be faster if compiled into a DLL?

Re: String compare

#5 Post by Auge_Ohr »

Victorio wrote: I have many databases with a mix of code pages: Latin II (CP852) and also Latin II Windows (CP1250).
CP852 = DOS = OEM
CP1250 = Windows = ANSI
Have you tried ConvToAnsiCP() / ConvToOemCP()?

Is your Xbase++ app VIO (like Cl*pper) or GUI (DBF -> ConvToXXXX -> Screen)?
What type of index: NTX or CDX?

I would use an array with $ and STRTRAN()

Code:

FUNCTION Translate(cString)
LOCAL aOEM := { {CHR(oem), CHR(ansi)}, .... } // use Number with CHR() !!!
LOCAL i,iMax := LEN(aOEM)

  FOR i := 1 TO iMax
    IF aOEM[i][1] $ cString
       cString := STRTRAN(cString,aOEM[i][1],aOEM[i][2])
    ENDIF
  NEXT    
RETURN cString
greetings by OHR
Jimmy

Re: String compare

#6 Post by Victorio »

Hi Jimmy,

I tried this, like

FOR i := 1 TO iMax
   IF aOEM[i][1] $ cString
      cString := STRTRAN(cString,aOEM[i][1],aOEM[i][2])
   ENDIF
NEXT

but the DO CASE algorithm is still the fastest.
I figured out that this is because every STRxxx function is slower than a simple comparison (IF ELSE, DO CASE, ... chr(xxx)=...).
I mean, when comparing a string such as
x="ABC"
then
if chr(x[1])+chr(x[2])+chr(x[3])="ABC"
is faster than at(x,"ABC").

I tried ConvToAnsiCP() / ConvToOemCP(); I use them in the program to display OEM text correctly.
I use CDX, but I need the comparison for text files.
I use text files instead of a database because searching in them is quick and they need less disk space.
For example, a DBF file with its FPT file needs 100MB; after converting to TXT and compressing with the Lempel-Ziv algorithm it needs only 10MB.
This is because in a DBF every record takes up space even if it is empty, but in a text file it does not.

But this is not my priority right now; I need to solve other problematic tasks such as client/server, SQL and a virtual server,
because my client wants to install the application on a virtual server....

Re: String compare

#7 Post by Auge_Ohr »

Victorio wrote: I use text files instead of a database because searching in them is quick and they need less disk space.
For example, a DBF file with its FPT file needs 100MB; after converting to TXT and compressing with the Lempel-Ziv algorithm it needs only 10MB.
This is because in a DBF every record takes up space even if it is empty, but in a text file it does not.
How big is your HDD / SSD ...

hm ... you are using text files ... so your source comes from different OEM (DOS) / ANSI (Win) apps?

About STRTRAN(): you can load a whole text file as one string (up to 2GB) and replace characters in it.
If your text file contains 1000 x CHR(x), all of them will be transformed "on the fly" by

Code:

cString :=  STRTRAN(cString,CHR(x),CHR(y))

Another way: there are CP852 <-> CP1250 converters ...
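
The whole-file idea could look roughly like the sketch below. FoldFile() and its parameters are hypothetical names, and aPairs is assumed to be an array of { cFrom, cTo } pairs such as { {CHR(x), CHR(y)}, ... }; only MemoRead(), StrTran() and MemoWrit() are standard functions here.

```
// Sketch only: FoldFile(), cInFile, cOutFile and aPairs are
// hypothetical names chosen for this example.
PROCEDURE FoldFile(cInFile, cOutFile, aPairs)
LOCAL cText := MemoRead(cInFile)     // whole file as one string
LOCAL i

   FOR i := 1 TO Len(aPairs)
      // StrTran() replaces every occurrence of the character in one call
      cText := StrTran(cText, aPairs[i][1], aPairs[i][2])
   NEXT

   MemoWrit(cOutFile, cText)         // write the converted text back
RETURN
```

Replacing pair after pair is safe when folding to plain ASCII, because the replacement characters never appear again as a source character. For a CP852 <-> CP1250 conversion that is not guaranteed: an already converted character could be converted a second time by a later pair.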
Victorio wrote: But this is not my priority right now; I need to solve other problematic tasks such as client/server, SQL and a virtual server,
because my client wants to install the application on a virtual server....
What do you want to use as SQL server?
How do you want to access the SQL server from Xbase++ / Express++?

But your problem remains if your sources are OEM/ANSI text files which you have to import into the SQL server.
greetings by OHR
Jimmy

Re: String compare

#8 Post by Victorio »

Hi,
I have a 500GB disk in my home PC and an external 1TB one, :) but my clients' servers have only 400-800GB disks, with only about 100-150GB of free space.
That is why I convert the DBF files to TXT; for example, 2GB of text files would be 20GB in the original format.
And in the center, where the data for the whole country is kept, the DBF data is about 1500GB in about 70000 files... a "little" problem to manage.

Yes, the source is the DOS version of FoxPro and also Visual FoxPro. In addition, some data in the databases is written without diacritics and some with them, so before searching I must remove the diacritics from the search key and compare it with the data, also stripped of diacritics.
If a match is found, I generate the report with diacritics... a bit complicated, but I cannot figure out anything better.

I want to use the SQL server only for managing processes, for communication between the local client application and the server application.
The user puts his job into a database, which I currently have in DBF (CDX) format, and the server module reads this database and processes the jobs.
The data still stays in TXT format!!!

I also want to add some web access, where the user only enters the parameters for a job and the central server starts processing it.
But that is only my vision for the future.
(The users are from all over the country, and the server is in the capital...)

Re: String compare

#9 Post by Auge_Ohr »

I found this in the Alaska newsgroup, which may help you:

Re: Slow UpperX function
public.xbase++.generic
22 March 2016
Andreas Gehrs-Pahl

Code:

#include "NLS.ch"

Procedure SetUpperX()
LOCAL cLocale := LocaleConfigure(LOCALE_TO_UPPER)
LOCAL cLower  := "ƒÆ …µ¶Ç·‚ˆŠÒÔ¡ÖÞ¢“ä•àâå㣗é뚁€‡¥¤"
LOCAL cUpper  := "AAAAAAAAEEEEEEIIIIOOOOOOOOUUUUUUCCNN"
LOCAL nCount  := len(cLower)
LOCAL nChar   := 0
   for nChar := 1 to nCount
      cLocale[1 + asc(cLower[nChar])] := cUpper[nChar]
   next nChar
   LocaleConfigure(LOCALE_TO_UPPER, cLocale) // look at this !
return
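
The same indexing trick, Asc() of a character plus one as a position in a 256-character string, might also be applied directly to the diacritics conversion. The sketch below is untested for speed; FoldTable() and FoldText() are hypothetical names, and it assumes a single-byte code page such as CP1250, so that every character is one byte and Asc() returns 0..255.

```
// Sketch only: FoldTable() and FoldText() are hypothetical names.
// Assumes a single-byte code page, one character = one byte.

// Build a 256-character table: position n+1 holds the replacement
// for Chr(n). cFrom and cTo must have the same length.
FUNCTION FoldTable(cFrom, cTo)
LOCAL cTable := ""
LOCAL i

   FOR i := 0 TO 255
      cTable += Chr(i)               // start with the identity mapping
   NEXT
   FOR i := 1 TO Len(cFrom)
      cTable[Asc(cFrom[i]) + 1] := cTo[i]   // accented -> plain
   NEXT
RETURN cTable

// Replace each character by its table entry: one Asc() and one
// index operation per character, no At() and no DO CASE.
FUNCTION FoldText(cText, cTable)
LOCAL i
LOCAL nLen := Len(cText)

   FOR i := 1 TO nLen
      cText[i] := cTable[Asc(cText[i]) + 1]
   NEXT
RETURN cText
```

The table would be built once, for example cTable := FoldTable(cAccented, cPlain) with the two character lists from post #4, and then reused for the search key and for every line via Upper(FoldText(cLine, cTable)). Whether this is actually faster than the DO CASE version would have to be measured.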
greetings by OHR
Jimmy

Re: String compare

#10 Post by Victorio »

I will look at this, thank you.
