Julia
Table of Contents
- 1. Random tricks
- 2. Julia Lang
- 3. Standard library
- 4. Tools
- 5. Third-party Libraries
1 Random tricks
collect(Set([1, 2]))
returns an Array (a Vector); Set takes a single iterable, and the element order is not guaranteed.
1.1 Install library behind proxy
export HTTP_PROXY=127.0.0.1:8888
export HTTPS_PROXY=127.0.0.1:8888
julia
2 Julia Lang
2.1 Macros
Julia macros are hygienic:
First, variables within a macro result are classified as either local or global. A variable is considered local if it is assigned to (and not declared global), declared local, or used as a function argument name. Otherwise, it is considered global.
Local variables are then renamed to be unique (using the gensym function, which generates new symbols), and global variables are resolved within the macro definition environment.
You can opt out of hygiene with esc, e.g.
macro zerox()
    return esc(:(x = 0))
end
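To make the effect of hygiene concrete, here is a minimal sketch contrasting a hygienic macro with an escaped one. The names zero_hygienic, zero_escaped, and demo are made up for illustration.

```julia
# Hypothetical macros for illustration; only the second one uses esc.
macro zero_hygienic()
    return :(x = 0)       # x is renamed by the expander, so the caller's x is untouched
end

macro zero_escaped()
    return esc(:(x = 0))  # x refers to the caller's variable
end

function demo()
    x = 1
    @zero_hygienic
    after_hygienic = x    # still 1
    @zero_escaped
    after_escaped = x     # now 0
    return (after_hygienic, after_escaped)
end

demo()  # => (1, 0)
```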
A common pattern:
for op = (:sin, :cos, :tan, :log, :exp)
    eval(:(Base.$op(a::MyNumber) = MyNumber($op(a.x))))
end
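The loop relies on a MyNumber type with a field x, which these notes do not define. A minimal runnable sketch, assuming a simple struct definition and using the equivalent @eval shorthand:

```julia
# Assumed definition: the notes only imply that MyNumber has a field x.
struct MyNumber
    x::Float64
end

# Generate one Base method per function name.
for op in (:sin, :cos, :tan, :log, :exp)
    @eval Base.$op(a::MyNumber) = MyNumber($op(a.x))
end

exp(MyNumber(0.0)).x  # => 1.0
```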
3 Standard library
3.1 string
interpolation:
using Dates  # provides now()
var = 5
"$var-$(now())"
concatenation:
"hello" * "world"
findfirst(isequal('c'), "abcdc")
findnext(isequal('c'), "abcdc", 3)
occursin("world", "hello world")
startswith(s, prefix)
endswith(s, suffix)
strip(s)
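A short sketch of what these search helpers return, using made-up sample strings:

```julia
s = "abcdc"

findfirst(isequal('c'), s)          # => 3
findnext(isequal('c'), s, 4)        # => 5 (the search starts at index 4)
findfirst(isequal('z'), s)          # => nothing when there is no match

occursin("world", "hello world")    # => true
startswith("hello world", "hello")  # => true
endswith("hello world", "hello")    # => false
strip("  padded \n")                # => "padded"
```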
regular expression:
occursin(r"^\s*(?:#|$)", "# a comment")
m = match(r"(a|b)(c)?(d)", "acd")
m.match
m.captures
m.offset
m.offsets
replace("a", r"." => s"\g<0>1") # => "a1"
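For reference, a sketch of the values held in the RegexMatch fields above, plus the two nothing cases that are easy to trip over. The variable m2 is introduced here only for illustration.

```julia
m = match(r"(a|b)(c)?(d)", "acd")
m.match     # => "acd"
m.captures  # => ["a", "c", "d"]
m.offset    # => 1
m.offsets   # => [1, 2, 3]

# An optional group that does not participate is captured as nothing.
m2 = match(r"(a|b)(c)?(d)", "ad")
m2.captures[2] === nothing       # => true

# match itself returns nothing when the pattern is not found.
match(r"x", "acd") === nothing   # => true
```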
parse:
parse(Int, "123")
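The type comes first, then the string. A few variations, as a small sketch:

```julia
parse(Int, "123")            # => 123
parse(Float64, "1.5")        # => 1.5
parse(Int, "ff"; base = 16)  # => 255

# tryparse returns nothing instead of throwing on invalid input.
tryparse(Int, "abc")         # => nothing
```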
3.2 I/O
IO descriptors
- Base.stdout
- Base.stderr
- Base.stdin
Open file:
open("myfile.txt", "w") do io
    write(io, "Hello world!")
end
Read and write:
- write(io, x)
- flush(io)
- read(io, String): read all content
- read(io, Char): read a char
- readline(io)
- readlines(io)
- eachline(io): similar to readlines, but returns a lazy iterator instead of building an array
- readuntil(io, delim)
seek and position:
- position(io): return the current position
- seek(io, pos): seek to pos
To test IO functions, use io=IOBuffer("hello world")
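A sketch of that approach, exercising the read and seek functions listed above on an in-memory buffer. The sample text is made up.

```julia
io = IOBuffer("hello world\nsecond line\n")

first_char = read(io, Char)       # => 'h'
pos = position(io)                # => 1 (one byte consumed)
rest = readline(io)               # => "ello world" (newline is dropped)

seek(io, 0)                       # rewind to the start
word = readuntil(io, ' ')         # => "hello" (delimiter is dropped)

seek(io, 0)
lines = collect(eachline(io))     # => ["hello world", "second line"]

seek(io, 0)
everything = read(io, String)     # whole content, newlines included
```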
3.3 FS
In Base.Filesystem
walking dir:
readdir("/path/to/dir") # => array of filenames and dirnames
for (root, dirs, files) in walkdir(".")
    println("Directories in $root")
    for dir in dirs
        println(joinpath(root, dir)) # path to directories
    end
    println("Files in $root")
    for file in files
        println(joinpath(root, file)) # path to files
    end
end
isdir(path)
isfile(path)
Modifying:
mkdir("/path/to/dir")
mkpath("/this/is/mkdir/-p/")
cp(src, dst)
mv(src, dst)
rm(path)
touch(path)
chmod()
chown()
Tempdir
mktemp() # => (path, io), this is a temp file
mktempdir() # => path
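The directory functions above combine naturally. Here is a minimal sketch that builds a small tree inside a temporary directory and collects its files with walkdir. The file and directory names are hypothetical, and it uses the do-block form of mktempdir, which removes the directory when the block finishes.

```julia
found = String[]

mktempdir() do dir
    # Hypothetical layout: dir/a/b/note.txt and dir/top.txt
    mkpath(joinpath(dir, "a", "b"))
    write(joinpath(dir, "a", "b", "note.txt"), "hello")
    touch(joinpath(dir, "top.txt"))

    for (root, dirs, files) in walkdir(dir)
        for file in files
            # Store paths relative to the temporary directory.
            push!(found, relpath(joinpath(root, file), dir))
        end
    end
end  # the temporary directory and its contents are removed here

sort!(found)
```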
pathname:
dirname(path)
basename(path)
joinpath(parts...)
splitpath(path)
# remove . and ..
normpath(path)
expanduser(path)
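A quick sketch of these on a made-up relative path. Separators depend on the platform, so only the platform-independent results are shown as values.

```julia
path = joinpath("a", "b", "..", "c.txt")

basename(path)    # => "c.txt"
splitpath(path)   # => ["a", "b", "..", "c.txt"]
normpath(path)    # same as joinpath("a", "c.txt"): the ".." is resolved
dirname(path)     # same as joinpath("a", "b", "..")
```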
3.4 random numbers
basic:
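using Random  # for randstring, randperm, shuffle, seed!
rand()                  # uniform on [0, 1)
randn()                 # N(0,1)
randstring('a':'z', 8)  # the length is a positional argument
randperm(n)
shuffle(v)
Random.seed!(1234)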
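A small sketch of how seeding makes these calls reproducible. The variable names are illustrative only.

```julia
using Random

Random.seed!(1234)
a = rand(3)
Random.seed!(1234)
b = rand(3)
a == b                      # => true: same seed, same stream

perm = randperm(5)          # a random permutation of 1:5
sort(perm) == 1:5           # => true

s = randstring('a':'z', 8)  # 8 random lowercase letters
length(s)                   # => 8

shuffled = shuffle([1, 2, 3, 4])  # returns a shuffled copy
```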
sample from a distribution (Distributions.jl):
using Distributions
dist = MvNormal(11, 1)
rand(dist, 100) # => 11×100 matrix, one sample per column
3.5 network
download(url, localfile)
3.6 Useful functions
- sortperm(v): Return a permutation vector I that puts v[I] in sorted order.
- findfirst(predicate::Function, A): Return the index or key of the first element of A for which predicate returns true.
- mapreduce(f, op, itrs...; [init]): Apply function f to each element(s) in itrs, and then reduce the result using the binary function op.
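A brief sketch of the three functions on made-up data:

```julia
v = [30, 10, 20]
p = sortperm(v)                       # => [2, 3, 1]
v[p]                                  # => [10, 20, 30]

findfirst(iseven, [1, 3, 4, 6])       # => 3
findfirst(iseven, [1, 3])             # => nothing

mapreduce(x -> x^2, +, [1, 2, 3])     # => 14 (1 + 4 + 9)
mapreduce(abs, max, Int[]; init = 0)  # => 0: init covers the empty case
```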
intuitive ones:
- reverse
- abs
- median (requires using Statistics)
4 Tools
4.1 profiling
- manual: https://docs.julialang.org/en/v1/manual/profile/
- graphical viewer: timholy/ProfileView.jl
using Profile
Profile.init(n = 10^7, delay = 0.01)
Profile.clear()
@profile foo()
Profile.print()
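A minimal end-to-end sketch of this workflow. The function busy is a made-up workload standing in for foo, and the first call is there so that compilation is not what gets profiled.

```julia
using Profile

# Hypothetical workload to profile.
function busy(n)
    s = 0.0
    for i in 1:n
        s += sin(i)
    end
    return s
end

busy(10)                      # run once so the method is compiled
Profile.clear()               # drop samples from earlier runs
@profile busy(10^7)
Profile.print(maxdepth = 10)  # limit the depth of the printed call tree
```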
4.2 Using Pkg
using Pkg
Pkg.add(PackageSpec(url="https://github.com/lihebi/julia-repl", rev="master"))
To develop a project:
Pkg.develop(PackageSpec(url="https://github.com/lihebi/julia-repl"))
Then view the current pkg status:
Pkg.status()
You will see:
EmacsREPL v0.1.0 [`~/.julia/dev/EmacsREPL`]
5 Third-party Libraries
5.2 web & servers
- JuliaWeb/JuliaWebAPI.jl: this is interesting, it wraps a julia function as a remote callable API.
- GenieFramework/Genie.jl: this is an MVC framework for building web apps with sophisticated routing. It should work, but it is probably too complex for my purpose.
- JuliaWeb/HTTP.jl: seems to be more mature.
5.3 Static compilation
JuliaLang/PackageCompiler.jl: it has two modes:
- build a sysimage. This still requires julia to run, but is faster to start. When running, it seems to be a regular julia session.
- app. This can be run without julia.
I'm interested in the sysimage one. Specifically, you can do this:
using PackageCompiler
create_sysimage([:CuArrays, :Zygote, :Distributions, :LightGraphs, :MetaGraphs,
                 :CSV, :Plots, :DataFrames, :HDF5, :TensorOperations],
                sysimage_path="myimage.so", replace_default=true)
And start julia like this:
julia --sysimage myimage.so
The replace_default argument, if set to true, will replace julia's default image, so you no longer need to specify the sysimage. To restore the default, use
restore_default_sysimage()
It also seems possible to precompile only some functions.
5.4 reference
- juliastats: https://juliastats.org/
- kmsquire/Match.jl
- JuliaStats/RDatasets.jl: Interface to the vincentarelbundock/Rdatasets
5.5 ML library
- mpastell/LIBSVM.jl: Interface to libsvm
- willtebbutt/Stheno.jl: Gaussian Process
- STOR-i/GaussianProcesses.jl: Gaussian Process
- alan-turing-institute/MLJ.jl
5.5.1 TODO cstjean/ScikitLearn.jl
5.6 Data Representation
5.7 Optimizers
- Optim.jl: optimization
- JuMP.jl: another optimizer with more solvers
- JuliaMath/IterativeSolvers.jl: CG method for solving linear equations
5.8 GPU computing
- CuArrays.jl: https://github.com/JuliaGPU/CuArrays.jl
- CUDAapi.jl: https://github.com/JuliaGPU/CUDAapi.jl
- CUDAdrv.jl: https://github.com/JuliaGPU/CUDAdrv.jl
- CUDAnative.jl: https://github.com/JuliaGPU/CUDAnative.jl
5.10 Datasets
- METADATA.jl: Used for official package registry
- Metalhead.jl: Some vision models and dataset
- JuliaIO/HDF5.jl
5.10.1 DataFrames.jl
This is actually pretty easy to use.
Construction:
df = DataFrame(A = 1:4, B = ["M", "F", "F", "M"])
Column-by-column construction:
df = DataFrame()
df.A = 1:8
df.B = ["M", "F", "F", "M", "F", "M", "M", "F"]
Row-by-row construction:
df = DataFrame(A = Int[], B = String[])
push!(df, (1, "M"))
push!(df, [2, "N"])
push!(df, Dict(:B => "F", :A => 3))
Data can be accessed using dot-notation, e.g. df.A. You can also pass selector expressions to filter data out.
df[df.A .> 1, :]
sorting:
sort(df, [:A, :B])
5.10.2 CSV.jl
Reading
CSV.read() reads a file into a data frame. The columns can be accessed using dot syntax.
df = CSV.read(fname, DataFrame) # recent CSV.jl versions require the sink type
df.a
df.b
CSV.Rows() returns, well, CSV.Rows; you can access the columns of each row by dot notation.
for row in CSV.Rows(fname)
    @show row.a row.b
end
Writing: CSV.write(fname, table). The interface seems to be Tables.jl, so table could be just a DataFrame.
5.11 Images
colorview, channelview, RGB
5.12 Graph
5.12.1 LightGraphs.jl
A great package for
- just the graph
- generate different random graphs
- traversal
- plotting
- algorithms:
  - shortest path
  - minimum spanning tree
  - distance metrics
5.12.2 MetaGraphs.jl
LightGraphs with arbitrary data on nodes.
5.12.3 Compose.jl
The racket/pict for Julia.
5.12.4 GraphLayout.jl
Alternatives:
5.13 Language & Compiler tools
- MacroTools.jl
- kmsquire/Match.jl
- SciML/RecursiveArrayTools.jl: this is to solve the array-of-arrays problem. Specifically, the splatting syntax in cat can cause a stack overflow if the array is too large. Some discussions:
  - stack overflow link
  - stack overflow link
  - stack overflow link
  - github issue
  - or Iterators.flatten may work
collect(Iterators.flatten([[1, 2, 3], [4, 5, 6]]))
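As a sketch of splat-free ways to flatten an array of arrays using only Base, with made-up sample data: reduce(vcat, ...) is another option alongside Iterators.flatten.

```julia
nested = [[1, 2, 3], [4, 5, 6]]

flat_iter   = collect(Iterators.flatten(nested))  # => [1, 2, 3, 4, 5, 6]
flat_reduce = reduce(vcat, nested)                # => [1, 2, 3, 4, 5, 6]

# vcat(nested...) gives the same result here, but it passes every inner
# array as a separate argument, which is the pattern to avoid for very
# large inputs.
flat_splat = vcat(nested...)
```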
5.14 Probabilistic packages
- Distributions.jl
- GLM.jl (!!!)