First draft done!

Yeah! I just finished the last chapter.

These are all still first-draft chapters. I will start editing next week. They were written out of order, so there are likely to be some inconsistencies and such, but the editing will take care of that.

Now is a good time to give me comments if you have any. Then I will address them in the first round of edits. Otherwise they will wait until the second iteration of editing, which will probably be after I have used the book in class. In any case, I will give the book a careful read and fix what I find in the coming weeks before the next term starts.

As before, you can get the book from Leanpub, but if you prefer that I email it to you, just let me know.

Now that all chapters have a draft, I have increased the suggested price. You can still get it for free; just set the price to what you feel like paying. I am not writing this to make any money, but if I get enough from Leanpub, and if I can figure out how to get it from PayPal, I will spend it on getting a physical version made.

I put the suggested price at $4.99, but VAT is added on top, so it will vary depending on where you access the website from. In any case, the price is just a suggestion and you won’t hurt my feelings if you lower it to zero.

I have really only checked the PDF and EPUB versions (and mostly only the PDF), so the Mobi version is likely to have formatting errors galore. I will check it carefully after I have gone through the first edits of all chapters.

Slouching towards completion

I have finished the chapter on visualisation. I decided not to include ggvis after all. Later, I think I want to write a chapter on Shiny together with ggvis, but without writing about dynamic documents I don’t have much reason to include ggvis just yet.

Current status

I have four chapters yet to write, and at least two of these are chapters I don’t really know how to write.

I want to write something about dealing with large data sets. I don’t mean Big Data (I think that is a completely different topic and one that is way beyond the scope of my class and this book), but I want to say a bit about how to deal with data when you run into problems with size. I mostly run into that when plotting where, say, scatterplots have too many points and you need to summarise the data instead. But maybe I should also write something about using dplyr with SQL, or about using data.table instead of data.frame. I still haven’t quite figured out what to put there.

For the next class I am teaching I also need to have a chapter on a data analysis project. In previous years the data analysis project has been the main topic of the class, and every student has picked a data set to see what they can get out of it. I am still going to use that for the class, since I think you learn much more from analysing a new data set where you don’t know what you will find, but I want to have a chapter with an example. I need to pick a good data set I can use to illustrate the topics in the previous chapters. I am not too worried about that, though. That should be simple enough.

Then there is a chapter for the “programming” part of the book. I am thinking optimisation, but that might change. I am not in a hurry to get it done, since that class is in the second half of autumn, but I would like to have it done earlier so I can focus on editing the book during autumn.

The fourth missing chapter is just the conclusions. I want to have some pointers to other books worth reading, but I will just update that as I think of books. No worries there.

Tomorrow I need to work on the “large data” chapter. I would be really grateful for ideas on what to put into it.

More chapters finished…

The writing is going more slowly than I had anticipated, but I still think I should manage to finish the book before the end of the month. I have drafted chapters on writing reports in R Markdown and on using supervised and unsupervised learning algorithms.

Current progress

Of course, the time estimate is based on guesstimates of word count, and I really don’t know how many words go into a chapter until I am done with it. Besides, it isn’t so much the writing that takes time as it is figuring out what to include in a chapter.

For the chapters I have written so far I had a good idea about what to write. The remaining chapters are more problematic. I have an idea for what to write in the plotting chapter: basic graphics, ggplot2 and ggvis (but I need to get familiar with ggvis for that; I have only played with it a little and not really used it much). For the “big data” chapter I haven’t really thought it through. I haven’t thought much about the chapter on profiling and optimisation either, but I don’t need it for the first class I am going to teach. It won’t be used until the second half of autumn, so I am not too worried about it.

What Project 1 should be I still have no idea.

Anyway, you can download the current version at Leanpub. If you don’t want to sign up there for it, send me an email and I will send you the book.

I have set the price for the book on Leanpub to free, with the suggested price also set to free, but two people have already decided to pay for it. I consider that a bit crazy since it isn’t even a half-finished book yet, but I am grateful. I just don’t know how to deal with PayPal, so I don’t know what to do with the payment yet. When I figure it out, I think I will use the money to figure out how to make a hard copy of the book. I think that can be done with Lulu if I pay for a test print, so I will look into that later. For now, I have to focus on getting the remaining chapters written.

Getting my book

I got a lot of positive feedback after writing here that I am working on a book, and a lot of people want to see a few chapters, so I decided to share the book in its current, very early state.

I put it on Leanpub just because it is a site I know. At least it is a way for me to share the book. Now, they do suggest that you pay a bunch of dollars to download it. Don’t do it. You can scale the price down to zero and I would recommend that you do. At this point it isn’t worth your money, and in any case I don’t know how to deal with the tax agency if I get any money from the US. Please just scale the price down to zero and we will all be happier.

I would really love to get feedback on the book, but I need to finish the chapters I have planned first. I hope you understand that if you send me comments, I will ignore them until those chapters are done. After that I will get started with editing, and then your comments will be really helpful.