# Collections and Data Structures

> In computer programming, a collection is a grouping of some variable number of data items (possibly zero) that have some shared significance to the problem being solved and need to be operated upon together in some controlled fashion. [\[1\]](#references)

This page provides syntax for different types of collections and data structures in Julia (arrays, sets, dictionaries, etc.). Each section includes an example to demonstrate the described methods.

## Arrays <a href="#arrays" id="arrays"></a>

Arrays are ordered collections of elements. In Julia, elements are indexed by consecutive integers starting at 1.

### Creating arrays <a href="#creating_arrays" id="creating_arrays"></a>

| Action                 | Syntax               |
| ---------------------- | -------------------- |
| New array (empty)      | `[]`                 |
| Specify type (integer) | `Int64[]`            |
| Specify type (string)  | `String[]`           |
| Array with values      | `[1, 2, 3, 4, 5]`    |
| Array with values      | `["a1", "b2", "c3"]` |
| Array of numbers       | `collect(1:10)`      |
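
The difference between an untyped and a typed empty array shows up in the element type. The following is a minimal sketch; the variable names (`untyped`, `ints`, `numbers`) are illustrative and not part of the original page.

```julia
# Element types of the array literals listed in the table
untyped = []                 # element type Any: can hold values of any type
ints = Int64[]               # element type Int64: only 64-bit integers
numbers = collect(1:10)      # materializes the range 1:10 into an array

push!(untyped, 1)
push!(untyped, "one")        # allowed, because the element type is Any
push!(ints, 1)               # push!(ints, "one") would throw an error

println(eltype(untyped))     # Any
println(eltype(ints))        # Int64
println(numbers)             # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

Specifying the element type up front lets Julia catch accidental insertions of the wrong type and usually gives faster code than an `Any` array.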

### Creating array from string <a href="#creating_array_from_string" id="creating_array_from_string"></a>

| Action                                                   | Syntax            |
| -------------------------------------------------------- | ----------------- |
| Split string `str` by delimiter into words (e.g., space) | `split(str, " ")` |
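
As a short illustration of `split`, the sketch below breaks a sentence into words. The sample string and variable names are assumptions made for this example. Note that `split` returns substrings (`SubString{String}`) rather than `String` values, and that calling `split(str)` without a delimiter splits on any whitespace.

```julia
sentence = "the quick brown fox"

words = split(sentence, " ")     # split on a single space
println(length(words))           # 4
println(words[2])                # quick
println(eltype(words))           # SubString{String}

# Convert to plain Strings if an array of String is required
plain_words = String.(words)
println(eltype(plain_words))     # String
```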

### Accessing elements <a href="#accessing_elements" id="accessing_elements"></a>

| Action                                     | Syntax              |
| ------------------------------------------ | ------------------- |
| Get length of array my\_array              | `length(my_array)`  |
| Get first element of array my\_array       | `my_array[1]`       |
| Get last element of array my\_array        | `my_array[end]`     |
| Get nth element of array my\_array (e.g., 2nd) | `my_array[2]`   |
| Check if element is in array               | `in(str, my_array)` |

### Adding and removing elements <a href="#adding_and_removing_elements" id="adding_and_removing_elements"></a>

| Action                        | Syntax                      |
| ----------------------------- | --------------------------- |
| Add element to end            | `push!(my_array, str)`      |
| Remove element from end       | `pop!(my_array)`            |
| Remove element from beginning | `popfirst!(my_array)`       |
| Add element to beginning      | `pushfirst!(my_array, str)` |
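
The four functions in the table mutate the array in place (hence the trailing `!`), and the two removal functions also return the element they removed. A minimal sketch, using a hypothetical `queue` array:

```julia
queue = ["b", "c"]

push!(queue, "d")              # queue is now ["b", "c", "d"]
pushfirst!(queue, "a")         # queue is now ["a", "b", "c", "d"]

last_item = pop!(queue)        # removes and returns "d"
first_item = popfirst!(queue)  # removes and returns "a"

println(last_item)             # d
println(first_item)            # a
println(queue)                 # ["b", "c"]
```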

### Sort and unique <a href="#sort_and_unique" id="sort_and_unique"></a>

| Action                                    | Syntax             |
| ----------------------------------------- | ------------------ |
| Sort array (will not change array itself) | `sort(my_array)`   |
| Sort array in place (will change array)   | `sort!(my_array)`  |
| Get unique elements in array              | `unique(my_array)` |
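
The sketch below contrasts `sort`, which returns a sorted copy, with `sort!`, which reorders the array itself. The `scores` array is made up for illustration.

```julia
scores = [3, 1, 2, 3, 1]

sorted_copy = sort(scores)       # new array; scores is unchanged
println(sorted_copy)             # [1, 1, 2, 3, 3]
println(scores)                  # [3, 1, 2, 3, 1]

println(unique(scores))          # [3, 1, 2] (order of first appearance)

sort!(scores)                    # sorts scores itself
println(scores)                  # [1, 1, 2, 3, 3]
println(sort(scores, rev=true))  # [3, 3, 2, 1, 1]
```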

### Compare arrays <a href="#compare_arrays" id="compare_arrays"></a>

| Action       | Syntax                            |
| ------------ | --------------------------------- |
| Intersection | `intersect(my_array, your_array)` |
| Union        | `union(my_array, your_array)`     |
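
When applied to arrays, both functions return an array with duplicates removed, keeping elements in order of first appearance. A small sketch with made-up values:

```julia
my_array = [1, 2, 3, 4]
your_array = [3, 4, 5, 3]

common = intersect(my_array, your_array)   # elements present in both
combined = union(my_array, your_array)     # elements present in either

println(common)      # [3, 4]
println(combined)    # [1, 2, 3, 4, 5]
```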

### Convert array to string <a href="#convert_array_to_string" id="convert_array_to_string"></a>

| Action                  | Syntax                         |
| ----------------------- | ------------------------------ |
| Convert array to string | `join(collect(my_array), str)` |
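
A brief sketch of `join` with illustrative values follows. For arrays, the `collect` call is optional because `join` accepts any iterable; `join` also takes an optional third argument that is used as the separator before the last element.

```julia
letters = ["a", "b", "c"]

println(join(letters, ", "))            # a, b, c
println(join(letters))                  # abc
println(join([1, 2, 3], "-"))           # 1-2-3
println(join(letters, ", ", " and "))   # a, b and c
```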

Input:

```julia
# arrays.jl

day_array = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
day = "Thursday"

array_length = length(day_array)
array_first_day = day_array[1]
array_last_day = day_array[end]

println("Length of array: $array_length")
println("First day of week: $array_first_day")
println("Third day of week: $(day_array[3])")
println("Last day of week: $array_last_day")

println("$day is in $day_array: $(in(day, day_array))")

# add Sunday to beginning and Saturday to end
pushfirst!(day_array, "Sunday")
push!(day_array, "Saturday")

# print each element of array
println("Day of week: ")
for i in eachindex(day_array)
    println("  $(day_array[i])")
end

println("Day of the week: $(join(collect(day_array), ";"))")

# sort the array and print again
sort!(day_array)
println("Day of the week (sorted): $(join(collect(day_array), ";"))")
```

Output:

```julia
Length of array: 5
First day of week: Monday
Third day of week: Wednesday
Last day of week: Friday
Thursday is in ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]: true
Day of week: 
  Sunday
  Monday
  Tuesday
  Wednesday
  Thursday
  Friday
  Saturday
Day of the week: Sunday;Monday;Tuesday;Wednesday;Thursday;Friday;Saturday
Day of the week (sorted): Friday;Monday;Saturday;Sunday;Thursday;Tuesday;Wednesday
```

## Sets <a href="#sets" id="sets"></a>

Sets are unordered collections of unique elements.

### Creating sets <a href="#creating_sets" id="creating_sets"></a>

| Action          | Syntax                          |
| --------------- | ------------------------------- |
| New set (empty) | `Set()`                         |
| Specify type    | `Set{Int64}()`                  |
| Set with values | `Set([1, 2, 3, 4, 5])`          |
| Set with values | `Set(["a1", "b2", "c3", "b2"])` |
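
Sets are created by calling the `Set` constructor, optionally with an element type in braces. The sketch below uses illustrative variable names and shows that duplicate values passed to the constructor are stored only once.

```julia
empty_set = Set()                          # element type Any
int_set = Set{Int64}()                     # empty set that only accepts integers
label_set = Set(["a1", "b2", "c3", "b2"])  # "b2" appears twice in the input

println(length(empty_set))                 # 0
println(eltype(int_set))                   # Int64
println(length(label_set))                 # 3
```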

### Interacting with sets <a href="#interacting_with_sets" id="interacting_with_sets"></a>

| Action                    | Syntax               |
| ------------------------- | -------------------- |
| Get length of set my\_set | `length(my_set)`     |
| Check if value is in set  | `in(str, my_set)`    |
| Add value                 | `push!(my_set, str)` |
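
A minimal sketch of these operations, with made-up color values. It also uses `delete!`, which is not listed in the table but is the Base function for removing an element from a set.

```julia
my_set = Set(["red", "yellow"])

push!(my_set, "blue")          # adds a new element
push!(my_set, "red")           # already present, so the set is unchanged

println(length(my_set))        # 3
println(in("blue", my_set))    # true
println("green" in my_set)     # false (infix form of in)

delete!(my_set, "yellow")      # remove an element
println(length(my_set))        # 2
```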

### Comparing sets <a href="#comparing_sets" id="comparing_sets"></a>

| Action       | Syntax                        |
| ------------ | ----------------------------- |
| Intersection | `intersect(my_set, your_set)` |
| Union        | `union(my_set, your_set)`     |
| Difference   | `setdiff(my_set, your_set)`   |

Input:

```julia
# sets.jl

color_set = Set(["red", "yellow", "blue"])
color_set2 = Set(["red", "orange", "yellow"])

println("Length	of set:	$(length(color_set))")

println("Color Set 1")
for color in color_set
    println("  $(color)")
end

println("Color Set 2: $(join(collect(color_set2), "---"))")

println("Intersection: $(intersect(color_set, color_set2))")
println("Union: $(union(color_set, color_set2))")
println("Difference: $(setdiff(color_set, color_set2))")
println("Difference: $(setdiff(color_set2, color_set))")
```

Output:

```julia
Length	of set:	3
Color Set 1
  yellow
  blue
  red
Color Set 2: yellow---orange---red
Intersection: Set(["yellow", "red"])
Union: Set(["yellow", "orange", "blue", "red"])
Difference: Set(["blue"])
Difference: Set(["orange"])
```

## Dictionaries <a href="#dictionaries" id="dictionaries"></a>

Dictionaries are unordered collections of key-value pairs in which the key serves as the index (“associative collection”). Like the elements of a set, keys are *always* unique.

### Creating dictionaries <a href="#creating_dictionaries" id="creating_dictionaries"></a>

| Action                 | Syntax                                                     |
| ---------------------- | ---------------------------------------------------------- |
| New dictionary (empty) | `Dict()`                                                   |
| Specify type           | `Dict{String, Int64}()`                                    |
| Dictionary with values | `Dict("one" => 1, "two" => 2, "three" => 3, "four" => 4)`  |
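
Dictionaries are created by calling the `Dict` constructor. The following sketch (variable names are illustrative) shows how the key and value types differ between an untyped, an explicitly typed, and a literal dictionary.

```julia
untyped_dict = Dict()                        # keys and values of type Any
typed_dict = Dict{String, Int64}()           # String keys, Int64 values
number_dict = Dict("one" => 1, "two" => 2)   # types inferred from the pairs

typed_dict["one"] = 1

println(keytype(untyped_dict))   # Any
println(keytype(typed_dict))     # String
println(valtype(typed_dict))     # Int64
println(length(number_dict))     # 2
```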

### Accessing dictionaries <a href="#accessing_dictionaries" id="accessing_dictionaries"></a>

| Action                                   | Syntax                                                |
| ---------------------------------------- | ----------------------------------------------------- |
| Get value for key in dictionary my\_dict | `my_dict["one"]`                                      |
| Check if dictionary has key              | `haskey(my_dict, "one")`                              |
| Check for key/value pair                 | `in(("one" => 1), my_dict)`                           |
| Get value (store default if key is missing) | `get!(my_dict, "one", 5)`<br>`get!(my_dict, "five", 5)` |
| Add key/value pair                       | `my_dict["five"] = 5`                                 |
| Delete key/value pair                    | `delete!(my_dict, "four")`                            |
| Get keys                                 | `keys(my_dict)`                                       |
| Get values                               | `values(my_dict)`                                     |
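
The behavior of `get!` is easiest to see side by side with the other lookups. This is a sketch using the sample dictionary from the table; it also shows `get` (without `!`), which is not in the table and returns a default without modifying the dictionary.

```julia
my_dict = Dict("one" => 1, "two" => 2, "three" => 3, "four" => 4)

println(my_dict["one"])              # 1
println(haskey(my_dict, "five"))     # false
println(in("one" => 1, my_dict))     # true
println(in("one" => 2, my_dict))     # false, key exists but value differs

println(get!(my_dict, "one", 5))     # 1, key exists so its value is kept
println(get!(my_dict, "five", 5))    # 5, key was missing so it is added
println(haskey(my_dict, "five"))     # true

println(get(my_dict, "six", 0))      # 0, and "six" is not added
delete!(my_dict, "four")
println(length(my_dict))             # 4
```

Looking up a missing key with `my_dict["six"]` throws a `KeyError`, so `haskey`, `get`, or `get!` is the safer choice when a key may be absent.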

### Converting dictionaries <a href="#converting_dictionaries" id="converting_dictionaries"></a>

| Action                  | Syntax                     |
| ----------------------- | -------------------------- |
| Convert keys to array   | `collect(keys(my_dict))`   |
| Convert values to array | `collect(values(my_dict))` |

### Sorting dictionaries <a href="#sorting_dictionaries" id="sorting_dictionaries"></a>

| Action                               | Syntax                                                              |
| ------------------------------------ | ------------------------------------------------------------------- |
| Sorting keys                         | `sort(collect(keys(my_dict)))`                                      |
| Sorting values                       | `sort(collect(values(my_dict)))`                                    |
| Sort by value (descending) with keys | `sort(collect(zip(values(my_dict), keys(my_dict))), rev=true)`      |
| Sort by value (ascending) with keys  | `sort(collect(zip(values(my_dict), keys(my_dict))), rev=false)`     |
| Get top n by value (e.g., 3)         | `sort(collect(zip(values(my_dict), keys(my_dict))), rev=true)[1:3]` |
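
The `zip` approach in the table produces `(value, key)` tuples, which sort by value first and by key on ties. The sketch below shows it next to an alternative that sorts the key-value pairs directly with `by = last`; the alternative is an option added here for comparison, not part of the original table.

```julia
my_dict = Dict("one" => 1, "two" => 2, "three" => 3, "four" => 4)

# Approach from the table: (value, key) tuples, highest value first
by_value = sort(collect(zip(values(my_dict), keys(my_dict))), rev=true)
println(by_value[1:3])       # [(4, "four"), (3, "three"), (2, "two")]

# Alternative: sort key => value pairs by their value (the last element)
pairs_desc = sort(collect(my_dict), by = last, rev = true)
for (key, value) in pairs_desc[1:3]
    println("$key = $value")
end
```

The alternative keeps each entry in `key => value` order, which avoids mixing up the two names when destructuring in a loop.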

Input:

```julia
# dicts.jl

day_dict = Dict()
day_length_dict = Dict()

day_dict["Mon"] = "Monday"
day_dict["Tue"] = "Tuesday"
day_dict["Wed"] = "Wednesday"
day_dict["Thu"] = "Thursday"
day_dict["Fri"] = "Friday"

if haskey(day_dict, "Wed")
    println("$(day_dict["Wed"])")
end

if !haskey(day_dict, "Sat")
    println("no key \"Sat\"")
end

println("print key-value pairs")
for day in keys(day_dict)
    println("  $day = $(day_dict[day])")
end

println("print values (sorted)")
for day_value in sort(collect(values(day_dict)))
    println("  $day_value")
end

# get length of each value and keep track of lengths
for day_value in values(day_dict)
    day_length = length(day_value)
    day_length_dict[day_value] = day_length
end

println("print lengths")
for day in keys(day_length_dict)
    println("  $day = $(day_length_dict[day])")
end

println("print lengths in descending order")
# each tuple is (length, day), so sorting orders by length first
for (day_length, day) in sort(collect(zip(values(day_length_dict), keys(day_length_dict))), rev=true)
    println("  $day_length = $day")
end

println("print lengths in ascending order")
for (day_length, day) in sort(collect(zip(values(day_length_dict), keys(day_length_dict))), rev=false)
    println("  $day_length = $day")
end
```

Output:

```julia
Wednesday
no key "Sat"
print key-value pairs
  Wed = Wednesday
  Tue = Tuesday
  Thu = Thursday
  Mon = Monday
  Fri = Friday
print values (sorted)
  Friday
  Monday
  Thursday
  Tuesday
  Wednesday
print lengths
  Friday = 6
  Tuesday = 7
  Thursday = 8
  Wednesday = 9
  Monday = 6
print lengths in descending order
  9 = Wednesday
  8 = Thursday
  7 = Tuesday
  6 = Monday
  6 = Friday
print lengths in ascending order
  6 = Friday
  6 = Monday
  7 = Tuesday
  8 = Thursday
  9 = Wednesday
```

## References

1. Wikipedia contributors (n.d.). Collection. In Wikipedia. Retrieved May 1, 2024, from <https://en.wikipedia.org/wiki/Collection_(abstract_data_type)>

## Resources <a href="#documentation" id="documentation"></a>

* Julia Documentation: [Base - Collections and Data Structures](https://docs.julialang.org/en/v1/base/collections/)
* Think Julia: [Chapter 10 - Arrays](https://benlauwens.github.io/ThinkJulia.jl/latest/book.html#chap10)
* Think Julia: [Chapter 11 - Dictionaries](https://benlauwens.github.io/ThinkJulia.jl/latest/book.html#chap11)
* Think Julia: [Chapter 12 - Tuples](https://benlauwens.github.io/ThinkJulia.jl/latest/book.html#chap12)

