Finally moving away from the ActiveRecord pattern for some of my 'models' in Favstar.
2012-05-18 03:30:14Removing one little integer column seems to have reduced the time to import this table into mysql from 12+ hours to 15 minutes.
2012-05-18 03:53:56@timhaines dude you should be using Mongo not SQL. What are you THINKING?!
2012-05-18 03:54:45That was a non-PK auto incrementing integer column (which requires an index) on a 3 column table with a composite primary key.
2012-05-18 03:55:29@aviel Dude, I've just got out of a bad relationship with Riak. I don't want to get into one with Mongo.
2012-05-18 03:56:24@dacort This table structure is how I model favs, retweets, and friendships. Bastardized structure to fit it into Rails's ActiveRecord reqs.
2012-05-18 03:57:57@knowncitizen something like that. Beat your head with your mistakes until you learn?
2012-05-18 04:04:13@timhaines in that, moved away and created a composite primary key, you mean?
2012-05-18 04:12:36@kenn No - it always had a composite primary key. I just removed the (now) redundant non-PK auto-inc column I had.
2012-05-18 04:13:57@timhaines heh. if you're on mysql 5.1+ and innodb you can take advantage of fast index creation, import without index and built it later.
2012-05-18 04:15:01@kenn I read about that briefly this morning. It's for non-PK indexes right? I don't think I'll need it now.
2012-05-18 04:16:08@timhaines ah yes, you're right, it's secondary indexes only. for innodb, be careful tho: compound primary key = large clustered index.
2012-05-18 04:17:33@kenn Yeah - I've been using this schema for a couple of years. The tables/indexes are indeed large.
2012-05-18 04:18:49@timhaines that is, all secondary indexes have VERY large pointer to the record, which is the same size as the compound index
2012-05-18 04:18:54@timhaines yeah, actually rails' convention is not bad in that sense. innodb works best when you have autoincremented single primary key
2012-05-18 04:19:44@kenn The "insert into t_new select.." was the step that dropped the auto inc column. I wonder why there's a difference in the 13G and 7G.
2012-05-18 04:22:47@timhaines that's another thing: fragmentation, i assume. you can have the same effect by empty ALTER TABLE if you use innodb_file_per_table
2012-05-18 04:23:28@timhaines did you check data/index ratio? SELECT table_name, table_rows, data_length, index_length FROM information_schema.tables
2012-05-18 04:26:44@kenn the table_rows column returns a vastly different number each time I run it.
2012-05-18 04:28:59@timhaines sure, it's just an educated guess by the engine. its histogram is used by the cost-based optimizer and don't need to be accurate
2012-05-18 04:29:47@kenn I would have expected running the "insert into select.." would have had the same result as the dump/reimport.
2012-05-18 04:30:19@kenn It jumped from 300m rows to 230m rows between queries. Very inaccurate.
2012-05-18 04:31:24