Data

Ken Pu
Tuesday, May 1, 2018

We will look at how data is accessed and processed in Clojure.

Principles of Data-driven Programming

Clojure is a functional language. Its data storage is primarily write-once, read-only, so new values are produced in one of two ways:

  1. Construction
  2. Transformation

We (almost) never modify data in-place.
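
As a minimal sketch of what this means in practice (the names original and extended are only illustrative), adding to a vector yields a new vector and leaves the old one untouched:

```clojure
;; Illustrative names; any vector would behave the same way.
(def original [1 2 3])

;; conj returns a new vector; it does not change `original`.
(def extended (conj original 4))

original ;; [1 2 3]
extended ;; [1 2 3 4]
```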

Managing Data

Construction

Definition: Construction

Building a data structure from smaller pieces is known as construction.

;; A vector
["Ken" "CS" "Clojure"]

;; A hashmap
{:name "Ken"
 :group "CS"
 :likes "Clojure"}

;; A set
#{ :red :green :blue }

;; A list (note the quote)
'("Ken" "likes" "Clojure")
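
The literals above are the usual way to construct data. When the pieces are only known at run time, the same structures can be built with core functions. The following is a small sketch; the var names are hypothetical and exist only so the results can be inspected:

```clojure
;; Function equivalents of the literal forms
(def v (vector "Ken" "CS" "Clojure"))                        ;; ["Ken" "CS" "Clojure"]
(def m (hash-map :name "Ken" :group "CS" :likes "Clojure"))  ;; a hashmap with three keys
(def s (hash-set :red :green :blue))                         ;; #{:red :green :blue} in some order
(def l (list "Ken" "likes" "Clojure"))                       ;; no quote needed with `list`

;; Building one collection from another
(def from-seq (vec (range 3)))            ;; [0 1 2]
(def as-set   (set [:red :red :blue]))    ;; duplicates collapse: two elements
(def as-map   (into {} [[:a 1] [:b 2]]))  ;; {:a 1, :b 2}
```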

Destructuring

Definition: Destructuring

The process of extracting smaller constituents from a data structure is known as destructuring.

Destructuring by access

Let’s assume that the data structures are properly bound to the symbols.

Accessing a list

(def a-list (range 10)) ;; 0, 1, 2, ... 9
(first a-list) ;; 0
(rest a-list) ;; 1, 2, ... 9
(nth a-list 2) ;; 2
(last a-list) ;; 9

Accessing a vector

(def a-vector [:a :b :c :d])

;; sequence functions such as first also work on vectors
(first a-vector) ;; :a

;; can do random-access very efficiently
(get a-vector 2) ;; :c - zero-indexed
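
A few more ways to read from a vector, sketched under the assumption that a-vector is defined as above. The main point is how get and nth differ when the index is out of range:

```clojure
(def a-vector [:a :b :c :d])

(nth a-vector 2)         ;; :c
(a-vector 2)             ;; :c - a vector can be called as a function of its index

;; Out-of-range indices
(get a-vector 10)        ;; nil
(get a-vector 10 :none)  ;; :none
(nth a-vector 10 :none)  ;; :none
;; (nth a-vector 10) without a default throws IndexOutOfBoundsException

(count a-vector)         ;; 4
(peek a-vector)          ;; :d - the last element of a vector
```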

Accessing a hashmap

(def a-map {:name "Ken"
            :likes ["Programming" "Clojure"]
            :office {:building "UA"
                     :room "4041"}})

;; Getting by key
(get a-map :name) ;; "Ken"
(get a-map :first-name) ;; nil
(get a-map :first-name :unknown) ;; :unknown

;; Get from nested structures with repeated get
(get (get a-map :office) :room) ;; "4041"
(get (get a-map :likes) 0) ;; "Programming"

;; The same lookups with get-in and a path of keys
(get-in a-map [:office :room]) ;; "4041"
(get-in a-map [:likes 0]) ;; "Programming"
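
Besides get and get-in, keywords and maps can themselves be called as functions. This is a short sketch, assuming the same a-map as above; the :floor key is hypothetical and chosen only because it is absent from the map:

```clojure
(def a-map {:name "Ken"
            :likes ["Programming" "Clojure"]
            :office {:building "UA"
                     :room "4041"}})

;; A keyword is a function that looks itself up in a map
(:name a-map)                 ;; "Ken"
(:first-name a-map :unknown)  ;; :unknown

;; A map is a function of its keys
(a-map :name)                 ;; "Ken"

;; Threading keywords reads like a path
(-> a-map :office :room)      ;; "4041"

;; get-in also accepts a default for missing paths
(get-in a-map [:office :floor] :unknown) ;; :unknown

;; Distinguish "missing" from "present but nil"
(contains? a-map :name)       ;; true
(contains? a-map :first-name) ;; false
```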

Advanced destructuring with binding

See [1] for details.
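
As a brief taste of what [1] covers, the sketch below binds several names at once inside let. It assumes the a-map from the previous section; the local names x, y and more are illustrative.

```clojure
(def a-map {:name "Ken"
            :likes ["Programming" "Clojure"]
            :office {:building "UA"
                     :room "4041"}})

;; Sequential destructuring: positions in a vector bind to names
(let [[x y & more] [:a :b :c :d]]
  [x y more])             ;; [:a :b (:c :d)]

;; Associative destructuring: :keys binds locals named after the keys
(let [{:keys [name likes]} a-map]
  [name likes])           ;; ["Ken" ["Programming" "Clojure"]]

;; Nested destructuring reaches into the inner map
(let [{{:keys [building room]} :office} a-map]
  [building room])        ;; ["UA" "4041"]

;; :or supplies a default when the key is missing
(let [{:keys [first-name] :or {first-name "unknown"}} a-map]
  first-name)             ;; "unknown"
```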

Transformation

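Clojure’s core data structures are immutable, so they cannot be modified in place. Instead of modifying existing data, we generate an incrementally altered copy. The copy shares most of its structure with the original, which keeps this cheap.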

Adding to data

Conjoin:

(conj [:a :b :c] :d) ;; [:a :b :c :d]

(conj '(:a :b :c) :d) ;; (:d :a :b :c)

(conj {:a 1, :b 2} [:c 3]) ;; {:a 1, :b 2, :c 3}

conj adds an element wherever it is most efficient for the given collection type: at the end of a vector, at the front of a list.
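
A few further cases, as a sketch of how conj behaves with several elements and with sets:

```clojure
;; Several elements are added one at a time, left to right
(conj [:a :b] :c :d)        ;; [:a :b :c :d]
(conj '(:a :b) :c :d)       ;; (:d :c :a :b) - each one goes to the front

;; Adding an element a set already contains changes nothing
(conj #{:red :green} :red)  ;; still two elements

;; into is conj applied over a whole collection
(into [:a :b] '(:c :d))     ;; [:a :b :c :d]
```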

Associate:

(assoc {:a 1 :b 2} :c 3) ;; {:a 1 :b 2 :c 3}

Cons:

(cons :d [:a :b :c]) ;; (:d :a :b :c)

(cons :d '(:a :b :c)) ;; (:d :a :b :c)

cons always:

  1. adds the element at the front
  2. returns a sequence that prints like a list, even when given a vector
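
The difference between conj and cons shows up in the type of the result. The following REPL-style sketch uses a hypothetical vector v to check this:

```clojure
(def v [:a :b :c])

;; conj keeps the collection type
(vector? (conj v :d))             ;; true

;; cons returns a sequence, regardless of the input type
(vector? (cons :d v))             ;; false
(seq? (cons :d v))                ;; true

;; It still compares equal to a list with the same elements
(= (cons :d v) '(:d :a :b :c))    ;; true

;; Convert back with vec if a vector is needed
(vec (cons :d v))                 ;; [:d :a :b :c]
```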

Updating data by replacement

Only vectors and hashmaps can be “updated” by replacement: assoc returns a copy with the value at the given index or key replaced.

(assoc [1 -2 3] 1 2) ;; [1 2 3]

(assoc {:name "K"
        :likes ["Programming" "Clojure"]} :name "Ken")
;; {:name "Ken", :likes ["Programming" "Clojure"]}

(assoc-in {:name "K"
           :likes ["Programming" "Clojure"]}
          [:likes 0] "Teach")
;; {:name "K", :likes ["Teach" "Clojure"]}
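
When the new value depends on the old one, update and update-in apply a function instead of taking a replacement value. This is a sketch only; the person map and its :visits key are hypothetical additions for illustration:

```clojure
;; Hypothetical data: :visits is not part of the earlier examples.
(def person {:name "K"
             :visits 1
             :likes ["Programming" "Clojure"]})

;; update applies a function to the value at a key
(update person :visits inc)
;; {:name "K", :visits 2, :likes ["Programming" "Clojure"]}

;; update-in does the same along a path; extra arguments go to the function
(update-in person [:likes] conj "Teaching")
;; {:name "K", :visits 1, :likes ["Programming" "Clojure" "Teaching"]}

;; assoc on a vector may use the index one past the end, which appends
(assoc [1 2 3] 3 4) ;; [1 2 3 4]
;; Any larger index throws IndexOutOfBoundsException
```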

Removing from data by shrinkage

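;; Delete from list
(drop 1 '(:a :b :c)) ;; (:b :c)
(pop '(:a :b :c))    ;; (:b :c) - pop removes the first element of a list

;; Delete from vector
(drop 1 [:a :b :c])  ;; (:b :c) - a sequence, not a vector
(pop [:a :b :c])     ;; [:a :b] - pop removes the last element of a vector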

;; Delete from hashmap
(dissoc {:name "Ken"
         :office "UA4041"} :office) ;; {:name "Ken"}

;; Delete from set (sets use disj; dissoc only works on maps)
(disj #{ :red :blue :green :white } :white) ;; #{:red :blue :green}
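
Some related removals, sketched with a hypothetical vector v. Note that sets are shrunk with disj, and that Clojure has no single core function for removing an element from the middle of a vector, so the last two forms are just common workarounds:

```clojure
;; disj and dissoc accept several elements or keys at once
(disj #{:red :blue :green :white} :white :green) ;; #{:red :blue}
(dissoc {:a 1 :b 2 :c 3} :a :b)                  ;; {:c 3}

;; Removing something that is absent is not an error
(dissoc {:a 1} :missing)                         ;; {:a 1}

(def v [:a :b :c :d])

;; Drop from the front but keep a vector
(subvec v 1)                        ;; [:b :c :d]
(vec (drop 1 v))                    ;; [:b :c :d]

;; Remove the element at index 2 by joining two slices
(into (subvec v 0 2) (subvec v 3))  ;; [:a :b :d]

;; Remove by value rather than by position
(vec (remove #{:c} v))              ;; [:a :b :d]
```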

See [2] for more data transformation functions.


References

  1. https://clojure.org/guides/destructuring
  2. https://clojure.org/api/cheatsheet