favstarの中の人のActiveRecordパターンからの移行について

ActiveRecordパターンからの移行について。
1
Tim Haines @TimHaines

Finally moving away from the ActiveRecord pattern for some of my 'models' in Favstar.

2012-05-18 03:30:14
Tim Haines @TimHaines

Removing one little integer column seems to have reduced the time to import this table into mysql from 12+ hours to 15 minutes.

2012-05-18 03:53:56
aviel @aviel

@timhaines dude you should be using Mongo not SQL. What are you THINKING?!

2012-05-18 03:54:45
Tim Haines @TimHaines

That was a non-PK auto incrementing integer column (which requires an index) on a 3 column table with a composite primary key.

2012-05-18 03:55:29
Tim Haines @TimHaines

@aviel Dude, I've just got out of a bad relationship with Riak. I don't want to get into one with Mongo.

2012-05-18 03:56:24
Tim Haines @TimHaines

@dacort This table structure is how I model favs, retweets, and friendships. Bastardized structure to fit it into Rails's ActiveRecord reqs.

2012-05-18 03:57:57
Tim Haines @TimHaines

@knowncitizen something like that. Beat your head with your mistakes until you learn?

2012-05-18 04:04:13
Kenn Ejima @kenn

@timhaines in that, moved away and created a composite primary key, you mean?

2012-05-18 04:12:36
Tim Haines @TimHaines

@kenn No - it always had a composite primary key. I just removed the (now) redundant non-PK auto-inc column I had.

2012-05-18 04:13:57
Kenn Ejima @kenn

@timhaines heh. if you're on mysql 5.1+ and innodb you can take advantage of fast index creation, import without index and built it later.

2012-05-18 04:15:01
Tim Haines @TimHaines

@kenn I read about that briefly this morning. It's for non-PK indexes right? I don't think I'll need it now.

2012-05-18 04:16:08
Kenn Ejima @kenn

@timhaines ah yes, you're right, it's secondary indexes only. for innodb, be careful tho: compound primary key = large clustered index.

2012-05-18 04:17:33
Tim Haines @TimHaines

@kenn Yeah - I've been using this schema for a couple of years. The tables/indexes are indeed large.

2012-05-18 04:18:49
Kenn Ejima @kenn

@timhaines that is, all secondary indexes have VERY large pointer to the record, which is the same size as the compound index

2012-05-18 04:18:54
Kenn Ejima @kenn

@timhaines yeah, actually rails' convention is not bad in that sense. innodb works best when you have autoincremented single primary key

2012-05-18 04:19:44
Tim Haines @TimHaines

@kenn The "insert into t_new select.." was the step that dropped the auto inc column. I wonder why there's a difference in the 13G and 7G.

2012-05-18 04:22:47
Kenn Ejima @kenn

@timhaines that's another thing: fragmentation, i assume. you can have the same effect by empty ALTER TABLE if you use innodb_file_per_table

2012-05-18 04:23:28
Kenn Ejima @kenn

@timhaines did you check data/index ratio? SELECT table_name, table_rows, data_length, index_length FROM information_schema.tables

2012-05-18 04:26:44
Tim Haines @TimHaines

@kenn the table_rows column returns a vastly different number each time I run it.

2012-05-18 04:28:59
Kenn Ejima @kenn

@timhaines sure, it's just an educated guess by the engine. its histogram is used by the cost-based optimizer and don't need to be accurate

2012-05-18 04:29:47
Tim Haines @TimHaines

@kenn I would have expected running the "insert into select.." would have had the same result as the dump/reimport.

2012-05-18 04:30:19
Kenn Ejima @kenn

@timhaines yes, that's true and that part, I have no idea.

2012-05-18 04:30:37
Tim Haines @TimHaines

@kenn It jumped from 300m rows to 230m rows between queries. Very inaccurate.

2012-05-18 04:31:24
残りを読む(58)