This is the website of Abulsme Noibatno Itramne (also known as Sam Minter). Comments here or emails to me are encouraged... or follow me on Twitter as @abulsme.



January 2008

Book: Data Mining: Concepts and Techniques: Second Edition

Author: Jiawei Han and Micheline Kamber
Started: 20 May 2007
Finished: 13 Jan 2008
770 p / 239 d
3 p/d

This is a book that was suggested to me as being useful for work. A textbook that would be useful for work. A long textbook for work. Anyway, I got it and started reading it when the next non-fiction slot in my rotation hit, jumping it ahead of other non-fiction books that were in line.

The first 3 chapters or so went very quickly as they were high level overviews of various things. Starting on Chapter 4 though (Data Cube Computation and Data Generalization) progress slowed down greatly. The content was a bit denser. And the reading a little less fun. Chapter 4 in particular was a subject that I could not get excited about. So I slowed down. I would have to force myself to read more. And that killed my momentum. Chapter 5 and beyond were more interesting again (at least to me) but with me out of the habit of reading regularly I actually changed my evening routine to include a “block” of reading time after my blocks for email and bills, and before my block for genealogy. Now, on an average weekday, I try for two blocks, but often only get one. It is only weekends where I usually get to blocks 3 and 4. So this meant I was only reading this book maybe once or twice a week, for 40 minutes at a shot. Now, this was still faster than I was doing without having a specific time set aside, as I would rarely just get the “Hey, I want to read the Data Mining textbook this evening before bed!” sort of feeling. So this got me going again, although still slowly.

Also of course here, I was just READING, I was not doing the exercises and problem sets in each chapter. So I definitely was not getting everything I would if I had, say, taken a class that actually used this as a textbook. But nevertheless this gave a good overview of concepts relating to Data Mining, much of which is clearly relevant to the kind of things I am responsible for at work. So this is good. Now, do I know each of the concepts back and forth deeply enough to be able to give a presentation on it or explain in detail to others? At a high level maybe, but at a detail level, no. But I am more familiar with all the concepts than when I started. Enough so to know what is being talked about if the concepts come up at work and to talk about them at a high level, and to follow discussions that go deeper. And I know where to look to refresh myself on details if I need them. So that is all good.

I do wish I’d forced myself through this at a faster rate though. That was just a matter of discipline. Taking almost 8 months to read this was a bit much. When actually sitting down and reading, I was running over a page a minute. I just didn’t sit down and actually do that enough. I have to get better at that, because two more similar large non-fiction books for work are coming up soon in my queue. They will be similarly chock full of good information that will be useful for me, and often interesting stuff… but not exactly page-turning reading where you just can’t wait to sit down with the book to read the next chapter. So I’ll have to work on that.

I’m thinking that rather than let the next work book take the next non-fiction slot I’ll let one of my other non-fiction books take a slot. So a fiction book, a non-work non-fiction, then another fiction, then the next work-related non-fiction. We shall see.

And when I do the next work-related non-fiction I do have to make myself go faster. Doing this one at an average of three pages per day killed my reading for 2007. In 2007 I only finished reading THREE books as opposed to ELEVEN in 2006. And I thought 2006 was actually a horrible year for reading for me. Once upon a time I read much more than I read today, and that is a shame. I need to start increasing that volume. I mean, come on, at LEAST one book a month, right? One a week would even be reasonable, but I think that is out of my reach right now…

Anyway, Data Mining by Han and Kamber… if you are working on Data Mining topics for work or school and need a good overview, grab this. If you are looking for fun reading on the beach… don’t.
