Sampling data from CSV to DBF

This forum is for eXpress++ general support.
Eugene Lutsenko
Posts: 1649
Joined: Sat Feb 04, 2012 2:23 am
Location: Russia, Southern federal district, city of Krasnodar

Sampling data from CSV to DBF

#1 Post by Eugene Lutsenko »

How do I copy data row by row from a CSV file into a DBF? The code below does not work: the DBF ends up with only one record.

Code: Select all

   oScrn   := DC_WaitOn( L('Filling database "Inp_data.dbf" with data from file: "train.csv"' ))

   CLOSE ALL

   nHandle := DC_txtOpen( 'train.csv' )
   USE Inp_data EXCLUSIVE NEW
   SELECT Inp_data

   DO WHILE !DC_TxtEOF( nHandle )                   // Start of the loop over lines

      mLine = DC_TxtLine( nHandle )                 // Extract a line from the text file

      APPEND BLANK

      mNFields = NUMTOKEN(mLine,",")
      FOR j=1 TO mNFields
          mWord = ALLTRIM(TOKEN(mLine,",",j))
          FIELDPUT(j, IF(j=1,mWord,VAL(mWord)))
      NEXT

      DC_TxtSkip( nHandle, 1 )
   ENDDO
   DC_TxtClose( nHandle )

   DC_Impl(oScrn)

sdenjupol148
Posts: 151
Joined: Thu Jan 28, 2010 10:27 am
Location: NYC

Re: Sampling data from CSV to DBF

#2 Post by sdenjupol148 »

Hi Eugene,

Have you looked at DC_Csv2Workarea() and DC_Csv2Array()?
You can find some samples in \exp19\samples\csv

Regards,

Bobby

Eugene Lutsenko
Posts: 1649
Joined: Sat Feb 04, 2012 2:23 am
Location: Russia, Southern federal district, city of Krasnodar

Re: Sampling data from CSV to DBF

#3 Post by Eugene Lutsenko »

Thanks, I'll take a look. I once looked at them and tried a lot of different options, but I have already forgotten about it.

Eugene Lutsenko
Posts: 1649
Joined: Sat Feb 04, 2012 2:23 am
Location: Russia, Southern federal district, city of Krasnodar

Re: Sampling data from CSV to DBF

#4 Post by Eugene Lutsenko »

It's still not working. The standard tools do not suit me, because the CSV file has about 5000 fields and I copy a sample of them into a DBF, which can hold a maximum of about 1700 fields. The CSV files can have several million lines. I can't append a record to Inp_data.dbf!

Code: Select all

   oScrn   := DC_WaitOn( L('Filling database "Inp_data.dbf" with data from file: "train.csv"' ))

   CLOSE ALL

   nHandle := DC_txtOpen( 'train.csv' )
   DC_TxtSkip( nHandle, 1 )

   USE Inp_data EXCLUSIVE NEW
   SELECT Inp_data

   DO WHILE !DC_TxtEOF( nHandle )                   // Start of the loop over lines

      mLine = DC_TxtLine( nHandle )                 // Extract a line from the text file

      aFieldVol := {}
      mNFields = NUMTOKEN(mLine,",")
*     FOR j=1 TO mNFields
      FOR j=1 TO 1500
          mWord = ALLTRIM(TOKEN(mLine,",",j))
          AADD(aFieldVol, mWord)
      NEXT

*     SELECT Inp_data
*     APPEND BLANK
      Inp_data->(DBAPPEND())
      FOR j=1 TO LEN(aFieldVol)
*         FIELDPUT(j, IF(j=1,aFieldVol[j],VAL(aFieldVol[j])))
          mFN = 'N'+ALLTRIM(STR(j))
          REPLACE Inp_data->&mFN WITH IF(j=1,aFieldVol[j],VAL(aFieldVol[j]))
      NEXT

      DC_TxtSkip( nHandle, 1 )
   ENDDO
   DC_TxtClose( nHandle )

   DC_Impl(oScrn)

rdonnay
Site Admin
Posts: 4868
Joined: Wed Jan 27, 2010 6:58 pm
Location: Boise, Idaho USA

Re: Sampling data from CSV to DBF

#5 Post by rdonnay »

If you can give me your CSV file and an empty DBF file, I will see what I can do.
The eXpress train is coming - and it has more cars.

Eugene Lutsenko
Posts: 1649
Joined: Sat Feb 04, 2012 2:23 am
Location: Russia, Southern federal district, city of Krasnodar

Re: Sampling data from CSV to DBF

#6 Post by Eugene Lutsenko »

rdonnay wrote: If you can give me your CSV file and an empty DBF file, I will see what I can do.
Greetings, Roger!
These are the "train.csv" and "test.csv" files, which can be downloaded here:
https://www.kaggle.com/c/santander-valu ... lenge/data
The DBF file should have the same fields, 4993 in total. However, it is probably impossible to create a single DBF file with that many fields. The maximum I have reliably obtained is 1,500 fields. So I will probably create several DBF files of 1000 fields each, linked by a one-to-one relationship. In the DBF file, all fields except the first are numeric with 1 decimal place. The first field is a 30-character text field.

Code: Select all

   aStructure := { { aFieldName[1], "C", 30, 0 },;      // ID
                   { aFieldName[2], "N", 15, 1 } }      // TARGET

*  FOR j=3 TO LEN(aFieldName)-2
   FOR j=3 TO 1500
       mFN = 'N'+ALLTRIM(STR(j))
       AADD(aStructure, { mFN, "N", 15, 1 })
*      AADD(aStructure, { aFieldName[j], "N", 15, 1 })
   NEXT
   DbCreate( 'Inp_data.dbf', aStructure )
Is it possible to use some other database format (not DBF) that has no such hard limit on the number of fields?
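
A minimal sketch of the multi-file layout described above, assuming each part file repeats the ID field so that the parts can be related one-to-one. The helper name BuildPartStructures, the part file names Inp_1.dbf, Inp_2.dbf, ... and the chunk size of 1000 numeric fields are illustrative assumptions, not part of the original code. Unlike the structure above, every numeric column (including TARGET) is simply named N plus its column number:

```xbase
// Hypothetical helper: returns one structure array per part file.
// Column 1 of the CSV is the ID; columns 2..nTotalFields are numeric.
// Every part starts with the ID field so the parts can be related 1:1.
FUNCTION BuildPartStructures( nTotalFields, nPerFile )
   LOCAL aParts := {}
   LOCAL aStru, j, nFirst, nLast

   nFirst := 2                                      // column 1 is the ID
   DO WHILE nFirst <= nTotalFields
      nLast := Min( nFirst + nPerFile - 1, nTotalFields )
      aStru := { { "ID", "C", 30, 0 } }
      FOR j := nFirst TO nLast
         // Same naming scheme as above: "N" + column number
         AAdd( aStru, { "N" + LTrim( Str( j ) ), "N", 15, 1 } )
      NEXT
      AAdd( aParts, aStru )
      nFirst := nLast + 1
   ENDDO
RETURN aParts

PROCEDURE Main()
   LOCAL aParts := BuildPartStructures( 4993, 1000 )
   LOCAL i

   FOR i := 1 TO Len( aParts )
      // Assumed file names: Inp_1.dbf, Inp_2.dbf, ...
      DbCreate( "Inp_" + LTrim( Str( i ) ) + ".dbf", aParts[i] )
   NEXT
RETURN
```

With 4993 CSV columns (the ID plus 4992 numeric columns) and 1000 numeric fields per file, this yields five part files, the last one holding 992 numeric fields. Each part then has at most 1001 fields, which stays below the 1,500 fields mentioned above.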
Attachments
Inp_name.zip
The names of the fields in the CSV file
(29.5 KiB) Downloaded 951 times
Last edited by Eugene Lutsenko on Sun Jun 24, 2018 9:44 am, edited 1 time in total.

Auge_Ohr
Posts: 1444
Joined: Wed Feb 24, 2010 3:44 pm

Re: Sampling data from CSV to DBF

#7 Post by Auge_Ohr »

Eugene Lutsenko wrote: The DBF file should have the same fields, 4993 in total.
DBF is the wrong database for so many fields ...

Nevertheless, I wonder why you have so many fields ... What about using an array and storing it in a memo field of type "V" (Var2Bin)?
greetings by OHR
Jimmy

Eugene Lutsenko
Posts: 1649
Joined: Sat Feb 04, 2012 2:23 am
Location: Russia, Southern federal district, city of Krasnodar

Re: Sampling data from CSV to DBF

#8 Post by Eugene Lutsenko »

Auge_Ohr wrote:
Eugene Lutsenko wrote: The DBF file should have the same fields, 4993 in total.
DBF is the wrong database for so many fields ...

Nevertheless, I wonder why you have so many fields ... What about using an array and storing it in a memo field of type "V" (Var2Bin)?
Hi, Jimmy!
There are so many fields because I often solve high-dimensional problems: "big data". The array is not suitable, because there are a lot of observations - millions. With that many fields, these databases barely fit in 2 GB, and sometimes do not fit at all. The largest database I have processed on my computer had 100,000 records by 100,000 fields. It took a little more than half an hour to create and was 239 GB in size. I used my own database format to process such data, since ADS does not support it either.

PS
My colleague has developed a module for parallel processing of information for high-speed synthesis and verification of large-scale models. This module uses graphics cards with an NVIDIA chip for non-graphical computing. But I have not used this module yet. It is still being fine-tuned to the point where it can actually be used.

Auge_Ohr
Posts: 1444
Joined: Wed Feb 24, 2010 3:44 pm

Re: Sampling data from CSV to DBF

#9 Post by Auge_Ohr »

Eugene Lutsenko wrote:The array is not suitable, because there are a lot of observations - millions.
If you have that much data, you should think about reducing/compressing it or changing the data format.

You can use a bitmap where each pixel can hold a value from 0 to 16777215.
A 4K bitmap has 2000x2000 = 4,000,000 pixels in a single bitmap and needs less space than a 2000x2000 array.
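
The packing behind this suggestion can be sketched as follows, assuming a 24-bit pixel (three bytes, values 0 to 16777215) and non-negative integer values. The function names PackPixel and UnpackPixel are hypothetical. Values with a decimal place, such as the N(15,1) fields discussed earlier, would first have to be scaled to integers and would only fit if they stay inside that range:

```xbase
// Hypothetical helper: store a non-negative integer below 2^24
// as a 3-byte string, one byte per colour channel (R, G, B).
FUNCTION PackPixel( nValue )
RETURN Chr( Int( nValue / 65536 ) ) + ;
       Chr( Int( nValue / 256 ) % 256 ) + ;
       Chr( nValue % 256 )

// Hypothetical helper: rebuild the integer from the 3-byte string.
FUNCTION UnpackPixel( cPixel )
RETURN Asc( SubStr( cPixel, 1, 1 ) ) * 65536 + ;
       Asc( SubStr( cPixel, 2, 1 ) ) * 256 + ;
       Asc( SubStr( cPixel, 3, 1 ) )

PROCEDURE Main()
   LOCAL cPixel := PackPixel( 123456 )

   ? Len( cPixel )              // three bytes per value
   ? UnpackPixel( cPixel )      // round trip back to the number
RETURN
```

At three bytes per value, 2000x2000 values take 12,000,000 bytes.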
Eugene Lutsenko wrote: My colleague has developed a module for parallel processing of information for high-speed synthesis and verification of large-scale models. This module uses graphics cards with an NVIDIA chip for non-graphical computing. But I have not used this module yet. It is still being fine-tuned to the point where it can actually be used.
Did he write an interface for Xbase++? :roll:

What about running multiple instances of your app, since modern PCs have more than one CPU?
Of course, your app must be network-capable if the instances share the same database.
greetings by OHR
Jimmy

Eugene Lutsenko
Posts: 1649
Joined: Sat Feb 04, 2012 2:23 am
Location: Russia, Southern federal district, city of Krasnodar

Re: Sampling data from CSV to DBF

#10 Post by Eugene Lutsenko »

Hi, Jimmy!

Yes, he has written an interface that lets his module be used from a program written in any language, or even manually.

My system is available online: http://lc.kubagro.ru/aidos/_Aidos-X.htm (use https://translate.yandex.ru/translate)

As for using video card memory, that is a good idea, but in this case it will not help. Another good idea is to treat all data of any nature as images in multidimensional space. I use it in my system.
