# File Input/Output

Many Julia programs involve the input and output of files. When analyzing a dataset, that dataset file will need to be pulled into your program (input). If you want to see the results of your analysis, your program will need an output.&#x20;

This section provides the syntax for inputing files (reading) and outputting results (writing) use base Julia (i.e., no packages such as CSV.jl).

### UC Irvine Machine Learning Repository: Adult Data Set

* Tabulate and report counts for sex in [Adult Data Set](https://archive.ics.uci.edu/dataset/2/adult) from the [UC Irvine Machine Learning Repository](https://archive.ics.uci.edu/).

***Dataset*** (example lines from `adult.data`)&#x20;

```julia
39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K
50, Self-emp-not-inc, 83311, Bachelors, 13, Married-civ-spouse, Exec-managerial, Husband, White, Male, 0, 0, 13, United-States, <=50K
38, Private, 215646, HS-grad, 9, Divorced, Handlers-cleaners, Not-in-family, White, Male, 0, 0, 40, United-States, <=50K
53, Private, 234721, 11th, 7, Married-civ-spouse, Handlers-cleaners, Husband, Black, Male, 0, 0, 40, United-States, <=50K
28, Private, 338409, Bachelors, 13, Married-civ-spouse, Prof-specialty, Wife, Black, Female, 0, 0, 40, Cuba, <=50K
```

***Input*** (`process_file.jl`)

```julia
# process_file.jl
# Tabulate and report counts for sex in Adult Data Set
# https://archive.ics.uci.edu/ml/datasets/adult

# relative path of file
data_file = open("_data/adult/adult.data", "r")

# absolute path of file
# data_file = open("/Users/user/data/adult/adult.data", "r")

# initialize collection (dictionary for tabulating counts)
gender_dict = Dict()

# read each line, extract sex, and keep track of counts
for line in readlines(data_file)

  # skip empty lines
  if isempty(line)
      continue
   end

  # split line into array, based on delimiter (comma and space)
  line_array = split(line, ", ")

  # tabulate the counts for gender
  gender = line_array[10]
  if haskey(gender_dict, gender)
    gender_dict[gender] += 1
  else
    gender_dict[gender] = 1
  end
end

# report total counts
println("Sort by key (alphabetical):")
for gender in keys(gender_dict)
  println("  $gender = $(gender_dict[gender])")
end

# report total counts by key, in reverse order
println("Sort by key (reverse alphabetical):")
for gender in sort(collect(keys(gender_dict)), rev=true)
  println("  $gender = $(gender_dict[gender])")
end

# report total counts by value, in reverse order (send output to file)
output_file = open("process_file_output.txt", "w")
println("Sort by value (reverse numerical):")
for (count, gender) in sort(collect(zip(values(gender_dict),keys(gender_dict))), rev=true)
  println("  $gender = $(gender_dict[gender])")
  write(output_file, "$gender = $count\n")
end
```

***Output***

```julia
Sort by key (alphabetical):
  Female = 10771
  Male = 21790
Sort by key (reverse alphabetical):
  Male = 21790
  Female = 10771
Sort by value (reverse numerical):
  Male = 21790
  Female = 10771
```

***Terminal***

```julia
$ julia process_file.jl
Sort by key (alphabetical):
  Female = 10771
  Male = 21790
Sort by key (reverse alphabetical):
  Male = 21790
  Female = 10771
Sort by value (reverse numerical):
  Male = 21790
  Female = 10771

$ ls -1
process_file.jl
process_file_output.txt

$ more process_file_output.txt
Male = 21790
Female = 10771
```

## Exercises <a href="#documentation" id="documentation"></a>

* Analyze the MIMIC-IV Demo Files Using Julia - *<mark style="color:yellow;">Forthcoming!</mark>*
* Analyze the SyntheticRI Demo Files Using Julia - *<mark style="color:yellow;">Forthcoming!</mark>*

## Resources <a href="#documentation" id="documentation"></a>

* Julia Documentation: [Base - I/O and Network](https://docs.julialang.org/en/v1/base/io-network/)
* Think Julia: [Chapter 14 - Files](https://benlauwens.github.io/ThinkJulia.jl/latest/book.html#chap14)
