Clojure for Data Science

Load and inspect the data

In the previous chapter, we used Incanter to load Excel spreadsheets with the incanter.excel/load-xls function. In this chapter, we will load a dataset from a tab-separated text file. For this, we'll use incanter.io/read-dataset, which expects to receive either a URL object or a file path represented as a string.

The file has been helpfully reformatted by AcmeContent's web team to contain just two columns—the date of the request and the dwell time in seconds. There are column headings in the first row, so we pass :header true to read-dataset:

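(require '[clojure.java.io :as io]
         '[incanter.core :as i]
         '[incanter.io :as iio])

;; Reads a tab-separated classpath resource; the first row holds the column headings.
(defn load-data [file]
  (-> (io/resource file)
      (iio/read-dataset :header true :delim \tab)))

;; Opens the loaded dataset in Incanter's viewer.
(defn ex-2-1 []
  (-> (load-data "dwell-times.tsv")
      (i/view)))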

If you run this code (either in the REPL or on the command line with lein run -e 2.1), you should see an output similar to the following:

[Image: the loaded dataset displayed in Incanter's table view]
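Besides viewing the table, it can help to check a dataset's shape and column names directly at the REPL. The following is a minimal sketch rather than code from the book: summarize-dataset is a hypothetical helper, and the sample dataset with its :date and :dwell-time column names and values is made-up stand-in data, since the real headings depend on the file.

```clojure
(require '[incanter.core :as i])

;; Hypothetical helper: returns the row count, column count, and
;; column names of an Incanter dataset as a plain map.
(defn summarize-dataset [data]
  {:rows    (i/nrow data)
   :cols    (i/ncol data)
   :columns (i/col-names data)})

;; Stand-in data for illustration only. The column names and values
;; here are assumptions, not the contents of dwell-times.tsv.
(def sample
  (i/dataset [:date :dwell-time]
             [["2015-01-01" 10]
              ["2015-01-02" 25]]))

(summarize-dataset sample)
```

Assuming load-data is defined as above, the same helper could be applied to the real file with (summarize-dataset (load-data "dwell-times.tsv")).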

Let's see what the dwell times look like as a histogram.