March 4, 2010

web2py Bayesian classifier and databases

Filed under: Uncategorized — mdipierro @ 3:05 pm

We have all run into the problem of pre-populating a database for the purpose of debugging or demoing a program. The problem is complicated by references in database tables and different field types.

web2py solves this problem with a populate function backed by a minimalist Bayesian classifier trained for this purpose.

Here is an example of usage:


Start the web2py interactive shell:

$ python web2py.py -S welcome -M

Define some tables, for example two related to each other (people and their comments):

>>> db.define_table('person',Field('name'))

>>> db.define_table('comment',Field('author',db.person),Field('body'))

Import the “populate” function:

>>> from gluon.contrib.populate import populate

Ask it to populate the database tables with 100 records each:

>>> populate(db.person,100)
>>> populate(db.comment,100)

See what you got:

>>> for comment in db(db.comment.id>0).select(limitby=(0,3)):
...     print comment.author.name, 'SAYS', comment.body

Cocomoto SAYS Rischgitz collection. j. In connection with large that they become.
Saducece SAYS Circumvent the latest measurements at least 600 a dull white except.
Popotadu SAYS Frequenting forests of specialized member. But a list of view.

If you like it, commit your changes.

>>> db.commit()

The populate function understands the different field types (‘string’, ‘text’, ‘integer’, ‘date’, ‘reference’, etc.) and their validation constraints, and populates each field accordingly. It is also very fast.
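To make the idea of type-aware population concrete, here is a minimal, self-contained sketch in plain Python. It is not web2py's implementation: the helpers fake_value and fake_record and the schema dictionaries are hypothetical names invented for this illustration, and the sketch ignores the validation constraints that the real function takes into account. It only shows the general pattern of choosing a generator per field type and making references point at existing parent records.

```python
import datetime
import random


def fake_value(field_type, rng, parent_ids=()):
    """Return a random value suited to field_type (hypothetical helper)."""
    if field_type == 'string':
        letters = 'abcdefghijklmnopqrstuvwxyz'
        return ''.join(rng.choice(letters) for _ in range(8)).capitalize()
    if field_type == 'text':
        # A short "sentence" built from ten random words.
        return ' '.join(fake_value('string', rng) for _ in range(10)) + '.'
    if field_type == 'integer':
        return rng.randint(0, 1000)
    if field_type == 'date':
        start = datetime.date(2010, 1, 1)
        return start + datetime.timedelta(days=rng.randint(0, 364))
    if field_type == 'reference':
        # A reference must point at a record that already exists.
        return rng.choice(list(parent_ids))
    raise ValueError('unsupported field type: %r' % field_type)


def fake_record(schema, rng, parent_ids=()):
    """Build one record (a dict) from a {field name: field type} schema."""
    return dict((name, fake_value(ftype, rng, parent_ids))
                for name, ftype in schema.items())


rng = random.Random(0)  # seeded so repeated runs give the same data

# Assumed schemas; 'posted_on' is an extra field added for illustration.
person_schema = {'name': 'string'}
comment_schema = {'author': 'reference', 'body': 'text', 'posted_on': 'date'}

# Parents first, so that comments have something to reference.
people = [dict(fake_record(person_schema, rng), id=i) for i in range(1, 6)]
person_ids = [p['id'] for p in people]
comments = [fake_record(comment_schema, rng, person_ids) for _ in range(10)]
```

The order matters here for the same reason it does in the session above: the person table is filled before the comment table, so every generated reference has a valid target.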


Sometimes you may want to train the Bayesian classifier with your own text and generate new text based on that. Here is how:

Import the Learner:

>>> from gluon.contrib.populate import Learner

Get some text, for example Alice in Wonderland:

>>> import urllib
>>> text = urllib.urlopen('http://www.gutenberg.org/files/11/11.txt').read()

Have the learner learn the text:

>>> learner = Learner()
>>> learner.learn(text)

Ask the learner to generate new text (1000 words) that “sounds similar” to the learned text:

>>> print learner.generate(1000)

Be. it further. so very angrily. it something; and took me there goes like a few minutes and she went to anyone providing access to repeat tis so bill thought alice and the footman in addition to drop the mouse come here. and she could hear you. and peeped into the use in the blame on slates and be of the paper has agreed to the hookah into the question is a table all access to do practically anything you see: the centre of the accident of a little glass table half the rest of receiving it gave her arm for you shouldn t remember half shut. this and was not be two.


He came between them at the door began talking such things to give your tongue said the hatter with seaography: that said was close to watch.

Of course the text does not make any sense. That is a feature.
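For readers curious how text that "sounds similar" can be produced at all, here is a minimal sketch using a first-order, word-level Markov chain. This is a stand-in technique chosen for illustration, not a description of how the web2py Learner works internally. The class name MarkovSketch is hypothetical; its learn and generate methods merely mirror the calls used above.

```python
import random
import re
from collections import defaultdict


class MarkovSketch(object):
    """Hypothetical stand-in: a first-order word-level Markov chain."""

    def __init__(self, seed=None):
        # Maps each word to the list of words seen right after it.
        self.followers = defaultdict(list)
        self.rng = random.Random(seed)

    def learn(self, text):
        """Record, for every word, which words follow it in the text."""
        words = re.findall(r"[a-z']+", text.lower())
        for current, following in zip(words, words[1:]):
            self.followers[current].append(following)

    def generate(self, length):
        """Return `length` words by repeatedly sampling a follower."""
        if not self.followers or length <= 0:
            return ''
        starts = sorted(self.followers)
        word = self.rng.choice(starts)
        output = [word]
        while len(output) < length:
            choices = self.followers.get(word)
            if choices:
                word = self.rng.choice(choices)
            else:
                # Dead end (word never had a follower): restart anywhere.
                word = self.rng.choice(starts)
            output.append(word)
        return ' '.join(output)


sample = ("alice was beginning to get very tired of sitting by her sister "
          "on the bank and of having nothing to do")
sketch = MarkovSketch(seed=1)
sketch.learn(sample)
generated = sketch.generate(20)
```

Because each word is chosen only from words that followed the previous one in the training text, short runs look locally plausible while the whole has no meaning, which matches the flavor of the output shown above.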

