6.6 File Input/Output
Documentation
Julia 1.0 Programming: Chapter 8 (I/O, Networking, and Parallel Computing)
Introducing Julia Wikibook: Working with text files
Julia Base: I/O and Network
Theory: Reading and Writing Files
Note: Using Julia Base functions only
UCI Machine Learning Repository: Adult Data Set
Tabulate and report counts for sex in Adult Data Set
Dataset (adult.data)
39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K
50, Self-emp-not-inc, 83311, Bachelors, 13, Married-civ-spouse, Exec-managerial, Husband, White, Male, 0, 0, 13, United-States, <=50K
38, Private, 215646, HS-grad, 9, Divorced, Handlers-cleaners, Not-in-family, White, Male, 0, 0, 40, United-States, <=50K
53, Private, 234721, 11th, 7, Married-civ-spouse, Handlers-cleaners, Husband, Black, Male, 0, 0, 40, United-States, <=50K
28, Private, 338409, Bachelors, 13, Married-civ-spouse, Prof-specialty, Wife, Black, Female, 0, 0, 40, Cuba, <=50K
Input (process_file.jl)
# process_file.jl
# Tabulate and report counts for sex in Adult Data Set
# https://archive.ics.uci.edu/ml/datasets/adult
# relative path of file
data_file = open("_data/adult/adult.data", "r")
# absolute path of file
# data_file = open("/Users/user/data/adult/adult.data", "r")
# initialize collection (dictionary for tabulating counts)
gender_dict = Dict()
# read each line, extract sex, and keep track of counts
for line in readlines(data_file)
# skip empty lines
if isempty(line)
continue
end
# split line into array, based on delimiter (comma and space)
line_array = split(line, ", ")
# tabulate the counts for gender
gender = line_array[10]
if haskey(gender_dict, gender)
gender_dict[gender] += 1
else
gender_dict[gender] = 1
end
end
# report total counts
println("Sort by key (alphabetical):")
for gender in keys(gender_dict)
println(" $gender = $(gender_dict[gender])")
end
# report total counts by key, in reverse order
println("Sort by key (reverse alphabetical):")
for gender in sort(collect(keys(gender_dict)), rev=true)
println(" $gender = $(gender_dict[gender])")
end
# report total counts by value, in reverse order (send output to file)
output_file = open("process_file_output.txt", "w")
println("Sort by value (reverse numerical):")
for (count, gender) in sort(collect(zip(values(gender_dict),keys(gender_dict))), rev=true)
println(" $gender = $(gender_dict[gender])")
write(output_file, "$gender = $count\n")
end
Output
Sort by key (alphabetical):
Female = 10771
Male = 21790
Sort by key (reverse alphabetical):
Male = 21790
Female = 10771
Sort by value (reverse numerical):
Male = 21790
Female = 10771
From BIDSS Manual (Need to Modify)
Terminal
$ julia process_file.jl
Sort by key (alphabetical):
Female = 10771
Male = 21790
Sort by key (reverse alphabetical):
Male = 21790
Female = 10771
Sort by value (reverse numerical):
Male = 21790
Female = 10771
$ ls -1
process_file.jl
process_file_output.txt
$ more process_file_output.txt
Male = 21790
Female = 10771
© Brown Center for Biomedical Informatics (BCBI) at Brown University. Last updated: November 15, 2022. Website built with Franklin.jl. Powered by the Julia programming language.