The example "Training an image classifier" currently uses the following code:
xs, ys = (
    # convert each image into h*w*1 array of floats
    [Float32.(reshape(img, 28, 28, 1)) for img in Flux.Data.MNIST.images()],
    # one-hot encode the labels
    [Float32.(Flux.onehot(y, 0:9)) for y in Flux.Data.MNIST.labels()],
)
However, with Flux v0.13.0, Flux.Data.MNIST no longer exists:
(Project) pkg> st Flux
Status `C:\Users\Dennis Bal\ProjectFolder\Project.toml`
[587475ba] Flux v0.13.0
julia> using Flux
julia> Flux.Data.MNIST
ERROR: UndefVarError: MNIST not defined
Stacktrace:
[1] getproperty(x::Module, f::Symbol)
@ Base .\Base.jl:35
[2] top-level scope
@ REPL[16]:1
So the example is broken. As a side note, I think the example would benefit from using MLUtils instead of DataLoaders.jl and MLDataPattern. Also, Flux already exports DataLoader, so there is no need to import it explicitly.
Nevertheless, I look through the docs and try to get started. I end up with the following code, which works with Flux's basic functionality:
julia> using Flux
julia> using Flux: onehotbatch, onecold
julia> using FluxTraining
julia> using MLUtils: flatten, unsqueeze
julia> using MLDatasets
julia> labels = 0:9
0:9
julia> traindata = MNIST.traindata(Float32) |> x->(unsqueeze(x[1], 3), onehotbatch(x[2], labels));
julia> size.(traindata)
((28, 28, 1, 60000), (10, 60000))
julia> trainloader = DataLoader(traindata, batchsize=128);
julia> validdata = MNIST.testdata(Float32) |> x->(unsqueeze(x[1], 3), onehotbatch(x[2], labels));
julia> size.(validdata)
((28, 28, 1, 10000), (10, 10000))
julia> validloader = DataLoader(validdata, batchsize=128);
julia> predict = Chain(flatten, Dense(28^2, 10))
Chain(
MLUtils.flatten,
Dense(784 => 10), # 7_850 parameters
)
julia> lossfunc(x, y) = Flux.Losses.logitbinarycrossentropy(predict(x), y)
lossfunc (generic function with 1 method)
julia> optimizer=ADAM()
ADAM(0.001, (0.9, 0.999), 1.0e-8, IdDict{Any, Any}())
julia> callbacks = [Metrics(accuracy)]
1-element Vector{Metrics}:
Metrics(Loss(), Metric(Accuracy))
julia> learner = Learner(predict, lossfunc; optimizer, callbacks)
Learner()
At this point, I check the loss and train with Flux's train!:
julia> lossfunc(validdata...)
0.7624986f0
julia> Flux.train!(lossfunc, Flux.params(predict), trainloader, optimizer)
julia> lossfunc(validdata...)
0.11266354f0
julia> Flux.train!(lossfunc, Flux.params(predict), trainloader, optimizer)
julia> lossfunc(validdata...)
0.08880948f0
julia> Flux.train!(lossfunc, Flux.params(predict), trainloader, optimizer)
julia> lossfunc(validdata...)
0.0801171f0
Training works without problems. However, when I try to train my learner, it seems that a single float is passed to predict rather than an array:
julia> fit!(learner, 1, (traindata, validdata))
Epoch 1 TrainingPhase() ...
ERROR: MethodError: no method matching flatten(::Float32)
Closest candidates are:
flatten(::AbstractArray) at C:\Users\usrname\.julia\packages\MLUtils\QTRw7\src\utils.jl:424
Stacktrace:
[1] macro expansion
@ C:\Users\usrname\.julia\packages\Zygote\Y6SC4\src\compiler\interface2.jl:0 [inlined]
[2] _pullback(ctx::Zygote.Context, f::typeof(flatten), args::Float32)
@ Zygote C:\Users\usrname\.julia\packages\Zygote\Y6SC4\src\compiler\interface2.jl:9
[3] macro expansion
@ C:\Users\usrname\.julia\packages\Flux\18YZE\src\layers\basic.jl:53 [inlined]
[4] _pullback
@ C:\Users\usrname\.julia\packages\Flux\18YZE\src\layers\basic.jl:53 [inlined]
[5] _pullback(::Zygote.Context, ::typeof(Flux.applychain), ::Tuple{typeof(flatten), Dense{typeof(identity), Matrix{Float32}, Vector{Float32}}}, ::Float32)
@ Zygote C:\Users\usrname\.julia\packages\Zygote\Y6SC4\src\compiler\interface2.jl:0
[6] _pullback
@ C:\Users\usrname\.julia\packages\Flux\18YZE\src\layers\basic.jl:51 [inlined]
[7] _pullback(ctx::Zygote.Context, f::Chain{Tuple{typeof(flatten), Dense{typeof(identity), Matrix{Float32}, Vector{Float32}}}}, args::Float32)
@ Zygote C:\Users\usrname\.julia\packages\Zygote\Y6SC4\src\compiler\interface2.jl:0
[8] _pullback
@ C:\Users\usrname\.julia\packages\FluxTraining\iBFSd\src\training.jl:54 [inlined]
[9] _pullback(ctx::Zygote.Context, f::FluxTraining.var"#70#72"{FluxTraining.var"#handlefn#78"{Learner, TrainingPhase}, FluxTraining.PropDict{Any}, Learner}, args::Chain{Tuple{typeof(flatten), Dense{typeof(identity), Matrix{Float32}, Vector{Float32}}}})
@ Zygote C:\Users\usrname\.julia\packages\Zygote\Y6SC4\src\compiler\interface2.jl:0
[10] _pullback
@ C:\Users\usrname\.julia\packages\FluxTraining\iBFSd\src\training.jl:70 [inlined]
[11] _pullback(::Zygote.Context, ::FluxTraining.var"#73#74"{FluxTraining.var"#70#72"{FluxTraining.var"#handlefn#78"{Learner, TrainingPhase}, FluxTraining.PropDict{Any}, Learner}, Chain{Tuple{typeof(flatten), Dense{typeof(identity), Matrix{Float32}, Vector{Float32}}}}})
@ Zygote C:\Users\usrname\.julia\packages\Zygote\Y6SC4\src\compiler\interface2.jl:0
[12] pullback(f::Function, ps::Zygote.Params{Zygote.Buffer{Any, Vector{Any}}})
@ Zygote C:\Users\usrname\.julia\packages\Zygote\Y6SC4\src\compiler\interface.jl:352
[13] gradient(f::Function, args::Zygote.Params{Zygote.Buffer{Any, Vector{Any}}})
@ Zygote C:\Users\usrname\.julia\packages\Zygote\Y6SC4\src\compiler\interface.jl:75
[14] _gradient(f::FluxTraining.var"#70#72"{FluxTraining.var"#handlefn#78"{Learner, TrainingPhase}, FluxTraining.PropDict{Any}, Learner}, #unused#::ADAM, m::Chain{Tuple{typeof(flatten), Dense{typeof(identity), Matrix{Float32}, Vector{Float32}}}}, ps::Zygote.Params{Zygote.Buffer{Any, Vector{Any}}})
@ FluxTraining C:\Users\usrname\.julia\packages\FluxTraining\iBFSd\src\training.jl:70
[15] (::FluxTraining.var"#69#71"{Learner})(handle::FluxTraining.var"#handlefn#78"{Learner, TrainingPhase}, state::FluxTraining.PropDict{Any})
@ FluxTraining C:\Users\usrname\.julia\packages\FluxTraining\iBFSd\src\training.jl:53
[16] runstep(stepfn::FluxTraining.var"#69#71"{Learner}, learner::Learner, phase::TrainingPhase, initialstate::NamedTuple{(:xs, :ys), Tuple{Float32, Float32}})
@ FluxTraining C:\Users\usrname\.julia\packages\FluxTraining\iBFSd\src\training.jl:133
[17] step!
@ C:\Users\usrname\.julia\packages\FluxTraining\iBFSd\src\training.jl:51 [inlined]
[18] (::FluxTraining.var"#67#68"{Learner, TrainingPhase, Tuple{Array{Float32, 4}, Flux.OneHotArray{UInt32, 10, 1, 2, Vector{UInt32}}}})(#unused#::Function)
@ FluxTraining C:\Users\usrname\.julia\packages\FluxTraining\iBFSd\src\training.jl:24
[19] runepoch(epochfn::FluxTraining.var"#67#68"{Learner, TrainingPhase, Tuple{Array{Float32, 4}, Flux.OneHotArray{UInt32, 10, 1, 2, Vector{UInt32}}}}, learner::Learner, phase::TrainingPhase)
@ FluxTraining C:\Users\usrname\.julia\packages\FluxTraining\iBFSd\src\training.jl:105
[20] epoch!
@ C:\Users\usrname\.julia\packages\FluxTraining\iBFSd\src\training.jl:22 [inlined]
[21] fit!(learner::Learner, nepochs::Int64, ::Tuple{Tuple{Array{Float32, 4}, Flux.OneHotArray{UInt32, 10, 1, 2, Vector{UInt32}}}, Tuple{Array{Float32, 4}, Flux.OneHotArray{UInt32, 10, 1, 2, Vector{UInt32}}}})
@ FluxTraining C:\Users\usrname\.julia\packages\FluxTraining\iBFSd\src\training.jl:168
[22] top-level scope
@ REPL[51]:1
I am completely stuck as to what goes wrong. Pointers in that regard would be appreciated, but the main issue is making the example functional and updating it so that data loading and utility functions come from current packages such as MLUtils.
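One possible lead, judging only from the stack trace: frame [16] shows the step receiving NamedTuple{(:xs, :ys), Tuple{Float32, Float32}}, and fit! was given the raw (traindata, validdata) tuples rather than the loaders. The sketch below uses plain Julia only and assumes, without my having checked the FluxTraining source, that each epoch iterates over the object it is given and destructures every element into xs and ys. The helper batch_types and the small arrays are hypothetical stand-ins, not part of any package.

```julia
# Hypothetical helper: destructure every element of `iter` into (xs, ys),
# the way a training step might, and record the resulting types.
batch_types(iter) = [(typeof(xs), typeof(ys)) for (xs, ys) in iter]

# Small stand-ins for the MNIST arrays (4 samples instead of 60000).
X = rand(Float32, 28, 28, 1, 4)
Y = rand(Float32, 10, 4)

# Iterating the raw tuple yields X, then Y; destructuring an array
# takes its first two elements, so xs and ys end up as scalars.
raw_types = batch_types((X, Y))

# Iterating a collection of (x, y) batches yields whole arrays instead.
batched_types = batch_types([(X, Y)])

println("raw tuple: ", raw_types)
println("batches:   ", batched_types)
```

If that assumption holds, passing the loaders, as in fit!(learner, 1, (trainloader, validloader)), would be the first thing to try.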
To improve the reliability of this package, could doc testing be used to ensure that in the future, the documentation examples actually run?