PyTables time performance


I'm working on a project related to text detection in natural images. I have trained a classifier, and I'm using PyTables to store the information. I have:

  • 62 classes (a-z, A-Z, 0-9)
  • each class has between 100 and 600 tables
  • each table has a single column storing 32-bit floats
  • each column has between 2^2 and 2^8 rows (depending on parameters)

My problem is that after training the classifier, it takes a lot of time to read the information back during testing. For example: one database has 27,900 tables (62 classes * 450 tables per class) with 4 rows per table, and it took approximately 4 hours to read and retrieve the information I need. To get the info it needs, the test program reads each table 390 times for the classes a-z and A-Z, and 150 times for the classes 0-9. Is this normal? I tried using the index option on the single column, but I don't see any performance gain. I work on a virtual machine with 2 GB of RAM, on an HP Pavilion dv6 (4 GB DDR3 RAM, Core 2 Duo).

This is because column lookup on tables is one of the slower operations you can do, and it is where most of the information lives. You have two basic options to increase performance with tables that have many columns and few rows:

  1. Pivot the structure so that you have a table with many rows and few columns (see the sketch after this list).

  2. Move to a more efficient data structure, like a CArray or EArray, for every row/column.
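
Here is a minimal sketch of option 1, assuming the numbers from your question (4 rows per table, ~450 tables per class); the file name features.h5, the node name scores, and the column names are all made up for illustration. One wide table with a label column replaces the thousands of single-column tables, and a single in-kernel query fetches everything for a class:

    import numpy as np
    import tables

    N_ROWS = 4  # rows per original table; 2**2 in the example above

    # One row here replaces one of the original single-column tables.
    class Score(tables.IsDescription):
        label    = tables.StringCol(1)                 # class label, e.g. b'a'
        table_id = tables.Int32Col()                   # index within the class (0..449)
        values   = tables.Float32Col(shape=(N_ROWS,))  # the former column of floats

    with tables.open_file('features.h5', mode='w') as h5:
        scores = h5.create_table('/', 'scores', Score, expectedrows=27900)
        row = scores.row
        for label in (b'a', b'b', b'c'):   # toy data; the real set has 62 classes
            for tid in range(450):
                row['label'] = label
                row['table_id'] = tid
                row['values'] = np.random.rand(N_ROWS).astype(np.float32)
                row.append()
        scores.flush()

    # One in-kernel query replaces hundreds of separate table reads.
    with tables.open_file('features.h5', mode='r') as h5:
        vals = h5.root.scores.read_where('label == b"a"', field='values')

With this layout, indexing makes more sense too: one index on the label column (scores.cols.label.create_index()) serves every query, instead of an index per tiny table.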

Additionally, you can try using compression to speed things up (see the sketch below). This is all sort of generic advice, because you haven't included any code.
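
As a sketch of option 2 combined with compression (again untested against your data; the file/node names are invented and Blosc level 5 is just a common starting point, not a tuned choice): one compressed, extendable array per class replaces the ~450 per-class tables, so a whole class comes back in a single chunked read:

    import numpy as np
    import tables

    # Blosc is usually the fastest codec shipped with PyTables; zlib and lzo also work.
    filters = tables.Filters(complevel=5, complib='blosc')

    with tables.open_file('features_arrays.h5', mode='w') as h5:
        # One extendable array per class: 450 tables of 4 floats become a 450x4 array.
        arr = h5.create_earray('/', 'class_a', atom=tables.Float32Atom(),
                               shape=(0, 4), filters=filters, expectedrows=450)
        arr.append(np.random.rand(450, 4).astype(np.float32))

    with tables.open_file('features_arrays.h5', mode='r') as h5:
        data = h5.root.class_a[:]   # one chunked, compressed read for the whole class

Which codec and complevel win depends on your data, so it is worth timing a couple of settings on a real file.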

