For a site I’ve been working on, I’ve set things up so all of the data can be loaded from a single Python script. Darned handy. The magic snippet that does all the nice resetting bits is:
from django.core import management
from django.contrib.auth.create_superuser import createsuperuser
from django.db import connection

cursor = connection.cursor()
# Grab every table in the public schema (PostgreSQL-specific)
cursor.execute("select tablename from pg_tables where schemaname = 'public'")
tables = cursor.fetchall()
for (table_name,) in tables:  # each row comes back as a one-element tuple
    cursor.execute('drop table "%s" cascade' % table_name)
And then I go on to run a bunch of other methods to do all that cool initial data loading goodness. Only we ran into a problem:
Traceback (most recent call last):
File "initial_data_load.py", line 641, in ?
File "initial_data_load.py", line 474, in create_geography_table
latitude = radians(float(lat)),)
File "/users/home/joseph/local/Django-0.95/django/db/models/manager.py", line 73, in create
File "/users/home/joseph/local/Django-0.95/django/db/models/query.py", line 223, in create
File "/users/home/joseph/local/Django-0.95/django/db/models/base.py", line 203, in save
File "/users/home/joseph/local/Django-0.95/django/db/backends/util.py", line 19, in execute
I couldn’t figure out what was happening for a while, since it only happened on one system (and annoyingly, not my laptop…). It turns out the detail is in that last line of the traceback. We had DEBUG=True set in settings.py, and that there Geography table, well, it’s a big’un. We blew out the memory for the Python process because when DEBUG=True, the database cursor does a really nice little thing: it keeps all your SQL queries for you. Which is great until you hit that per-process memory limit and your script terminates unexpectedly.
Switching settings.py to DEBUG=False stopped keeping those queries in memory, and the memory exception went away. Yay for the wisdom of DEBUG=False!
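To make the failure mode concrete, here’s a minimal pure-Python sketch of what Django’s debug cursor wrapper does (the class name and structure here are invented for illustration; the real code lives in django/db/backends/util.py): when debug is on, every executed statement gets appended to an in-memory list, so a script issuing hundreds of thousands of INSERTs grows without bound.

```python
class DebugCursorSketch:
    """Hypothetical stand-in for Django's debug cursor wrapper."""

    def __init__(self, debug=True):
        self.debug = debug
        self.queries = []  # grows with every query while debug is on

    def execute(self, sql, params=()):
        if self.debug:
            # The real wrapper records sql and timing info like this
            self.queries.append({"sql": sql, "params": params})
        # ...the real wrapper then delegates to the underlying DB cursor

# With debug on, every statement is retained:
debug_cursor = DebugCursorSketch(debug=True)
for i in range(100000):
    debug_cursor.execute("INSERT INTO geography VALUES (%s)", (i,))
print(len(debug_cursor.queries))  # 100000

# With debug off, nothing accumulates:
prod_cursor = DebugCursorSketch(debug=False)
for i in range(100000):
    prod_cursor.execute("INSERT INTO geography VALUES (%s)", (i,))
print(len(prod_cursor.queries))  # 0
```

Each retained entry is small, but multiply it by every row you load and the process memory climbs until the OS kills it.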
4 thoughts on “The wisdom of DEBUG=False when loading LOTS of data.”
Do you need to do SQL to load initial data? Can’t you use the Django ORM?
We could have used straight SQL, but we were using the Django ORM because it was easier. The only thing we hadn’t counted on was how many SQL statements would get generated when “bulk loading” through it with DEBUG=True, with the cursor object keeping a copy of all those statements in memory…
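If you do want to keep DEBUG=True during a bulk load, another option is to clear the query log periodically rather than let it grow; newer Django versions expose django.db.reset_queries() for exactly this. Here’s a generic sketch of that pattern with stand-in names (query_log plays the role of connection.queries, and insert_row is a placeholder for your ORM save):

```python
query_log = []  # stands in for connection.queries under DEBUG=True

def insert_row(row):
    # Placeholder: a real loader would save an ORM object here,
    # which would append the generated SQL to the debug log.
    query_log.append("INSERT INTO history VALUES (%r)" % (row,))

BATCH = 5000
for i, row in enumerate(range(225000)):  # ~225K rows, as in the comment above
    insert_row(row)
    if i % BATCH == BATCH - 1:
        del query_log[:]  # drop the accumulated statements every batch

print(len(query_log))  # 0
```

Since 225000 divides evenly by the batch size, the log is empty at the end; in general only the tail of the final batch would remain, which keeps peak memory bounded by BATCH entries instead of the full row count.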
Oh, I got confused; I saw the cursor stuff and immediately thought that you were doing SQL to load the data. Stupid me… you were just using SQL to get the list of tables and drop them.
Thank you so much, this is exactly the problem I had. Django ORM worked fine for importing the smaller (legacy) tables into my application, but ran out of memory on importing the 225K rows of data in my history table.
Thanks again for the solution.