Set up SPyNet according to the image size and model. For optimal performance, resize your images so that the width and height are multiples of 32. You can also specify which model to use. The currently supported models are the fine-tuned models 'sintelFinal' (default), 'sintelClean', and 'kittiFinal', and the base models 'chairsFinal' and 'chairsClean'.
spynet = require('spynet')
computeFlow = spynet.setup(512, 384, 'sintelFinal') -- for 384x512 images
Now you can call computeFlow anytime to estimate optical flow between image pairs.
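As a minimal sketch of the "multiple of 32" advice, the plain-Lua helper below rounds an arbitrary image size to the nearest multiple of 32 before it is passed to spynet.setup. The function names roundToMultiple and networkSize are hypothetical and not part of SPyNet; the example frame size is only an illustration.

```lua
-- Round a dimension to the nearest multiple of `base` (32 by default),
-- never going below one full multiple.
local function roundToMultiple(x, base)
  base = base or 32
  return math.max(base, math.floor(x / base + 0.5) * base)
end

-- Hypothetical helper: choose network dimensions for an arbitrary image size.
local function networkSize(width, height)
  return roundToMultiple(width), roundToMultiple(height)
end

local w, h = networkSize(1024, 436) -- e.g. a 436x1024 frame
print(w, h)
```

The resulting width and height would then be used both to resize the images and as the first two arguments of spynet.setup.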
Computing flow
Load an image pair, stack the two images along the channel dimension, and normalize the result.
require 'image'
im1 = image.load('samples/00001_img1.ppm')
im2 = image.load('samples/00001_img2.ppm')
im = torch.cat(im1, im2, 1) -- 6xHxW: both RGB images stacked along channels
im = spynet.normalize(im)
SPyNet works with batches of data on CUDA, so add a batch dimension, move the tensor to the GPU, and compute the flow:
im = im:resize(1, im:size(1), im:size(2), im:size(3)):cuda()
flow = computeFlow(im)
You can also use batch mode. If your images im are already a tensor of size Bx6xHxW, where B is the batch size and the 6 channels hold an RGB image pair, you can call computeFlow directly:
flow = computeFlow(im)
Training
Training sequentially is faster than training end-to-end, since only a small number of parameters has to be learned at each level. To train level N, we need the trained models at levels 1 to N-1. The new model is also initialized with the pretrained model at level N-1.
For example, to train level 3, we need the trained models at L1 and L2, and we initialize it with modelL2_3.t7.
th main.lua -fineWidth 128 -fineHeight 96 -level 3 -netType volcon \
-cache checkpoint -data FLYING_CHAIRS_DIR \
-L1 models/modelL1_3.t7 -L2 models/modelL2_3.t7 \
-retrain models/modelL2_3.t7
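The dependency between levels can be sketched as a small command builder in plain Lua. This is only an illustration: buildTrainCommand and its table layout are hypothetical, and it assumes that the -L1/-L2 pattern from the example above extends to one -Lk flag per lower level and that level 1 needs no -retrain argument.

```lua
-- Build the training command for pyramid level `level`.
-- `models[k]` is the path to the trained model for level k (hypothetical layout).
-- `opts` carries the values for the flags used in the example above.
local function buildTrainCommand(level, models, opts)
  assert(level >= 1, "level must be at least 1")
  local parts = {
    "th main.lua",
    "-fineWidth " .. opts.fineWidth,
    "-fineHeight " .. opts.fineHeight,
    "-level " .. level,
    "-netType volcon",
    "-cache " .. opts.cache,
    "-data " .. opts.data,
  }
  -- Every lower level must already be trained (assumed flag pattern: -Lk).
  for k = 1, level - 1 do
    assert(models[k], "missing trained model for level " .. k)
    parts[#parts + 1] = "-L" .. k .. " " .. models[k]
  end
  -- Initialize from the model one level below, if there is one.
  if level > 1 then
    parts[#parts + 1] = "-retrain " .. models[level - 1]
  end
  return table.concat(parts, " ")
end

local cmd = buildTrainCommand(3,
  { "models/modelL1_3.t7", "models/modelL2_3.t7" },
  { fineWidth = 128, fineHeight = 96, cache = "checkpoint", data = "FLYING_CHAIRS_DIR" })
print(cmd)
```

For level 3 this reproduces the command shown above on a single line; the appropriate -fineWidth and -fineHeight values for other levels are not covered here and have to be supplied by the caller.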
End2End SPyNet
The end-to-end version of SPyNet is easily trainable and is available at anuragranj/end2end-spynet.
Optical Flow Utilities
We provide flowExtensions.lua, which contains various functions to make your life easier with optical flow while using Torch/Lua. You can just copy this file into your project directory and use it off the shelf.
flowX = require 'flowExtensions'
[flow_magnitude] flowX.computeNorm(flow_x, flow_y)
Given flow_x and flow_y of size MxN each, evaluate flow_magnitude of size MxN.
[flow_angle] flowX.computeAngle(flow_x, flow_y)
Given flow_x and flow_y of size MxN each, evaluate flow_angle of size MxN in degrees.
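For a single pixel, the magnitude and angle of a flow vector can be sketched in plain Lua as follows. The functions flowNorm and flowAngle are hypothetical stand-ins, not the flowExtensions implementation; in particular, wrapping the angle into [0, 360) is an assumption, and the library may use a different range or sign convention.

```lua
-- math.atan2 is missing in Lua 5.4, where math.atan accepts two arguments.
local atan2 = math.atan2 or math.atan

-- Euclidean length of the flow vector (fx, fy).
local function flowNorm(fx, fy)
  return math.sqrt(fx * fx + fy * fy)
end

-- Direction of (fx, fy) in degrees, wrapped to [0, 360).
-- The wrap-around convention is an assumption made for this sketch.
local function flowAngle(fx, fy)
  local deg = math.deg(atan2(fy, fx))
  if deg < 0 then deg = deg + 360 end
  return deg
end

print(flowNorm(3, 4), flowAngle(0, 1))
```

The tensor versions apply the same per-pixel computation to every element of flow_x and flow_y.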
[rgb] flowX.field2rgb(flow_magnitude, flow_angle, [max], [legend])
Given flow_magnitude and flow_angle of size MxN each, return an image of size 3xMxN for visualizing optical flow. max (optional) specifies the maximum flow magnitude, and legend (optional) is a boolean that prints a legend on the image.
[rgb] flowX.xy2rgb(flow_x, flow_y, [max])
Given flow_x and flow_y of size MxN each, return an image of size 3xMxN for visualizing optical flow. max (optional) specifies the maximum flow magnitude.
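A common way to visualize flow is to map direction to hue and magnitude to saturation. The plain-Lua sketch below does this for one pixel. It illustrates the general idea only: hsv2rgb and flowPixelToRGB are hypothetical names, and the exact color mapping used by flowExtensions may differ. The sketch assumes maxMagnitude is greater than zero.

```lua
-- Convert HSV (h in degrees, s and v in [0, 1]) to RGB values in [0, 1].
local function hsv2rgb(h, s, v)
  local c = v * s                          -- chroma
  local hp = (h % 360) / 60                -- hue sector in [0, 6)
  local x = c * (1 - math.abs(hp % 2 - 1))
  local r, g, b
  if hp < 1 then r, g, b = c, x, 0
  elseif hp < 2 then r, g, b = x, c, 0
  elseif hp < 3 then r, g, b = 0, c, x
  elseif hp < 4 then r, g, b = 0, x, c
  elseif hp < 5 then r, g, b = x, 0, c
  else r, g, b = c, 0, x end
  local m = v - c
  return r + m, g + m, b + m
end

-- Map one flow vector to a color: direction selects the hue, and the
-- magnitude (clamped at maxMagnitude) selects the saturation.
local function flowPixelToRGB(magnitude, angleDeg, maxMagnitude)
  local s = math.min(magnitude / maxMagnitude, 1)
  return hsv2rgb(angleDeg, s, 1)
end

print(flowPixelToRGB(10, 120, 10))
```

With this mapping, zero motion is white, and the optional maximum controls how quickly colors saturate as the motion grows.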
[flow] flowX.loadFLO(filename)
Reads a .flo file. Loads the x and y components of optical flow into a 2-channel 2xMxN optical flow field. The first channel stores the x component and the second channel stores the y component.
flowX.writeFLO(filename,F)
Writes a 2xMxN flow field F, containing the x and y components of the flow in its first and second channels respectively, to filename, a .flo file.
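For orientation, the widely used Middlebury .flo layout starts with a 12-byte header: the 4-byte tag "PIEH", then the width and the height as little-endian 32-bit integers, followed by interleaved x and y float values in row order. The plain-Lua sketch below parses only that header from a string. The names readUInt32LE and parseFloHeader are hypothetical, and this is not the code used by flowX.loadFLO.

```lua
-- Decode a little-endian unsigned 32-bit integer starting at byte `pos` of `s`.
local function readUInt32LE(s, pos)
  local b1, b2, b3, b4 = s:byte(pos, pos + 3)
  return b1 + b2 * 256 + b3 * 65536 + b4 * 16777216
end

-- Parse the 12-byte header of a .flo file held in the string `data`.
-- Returns the width, the height, and the total file size the header implies.
local function parseFloHeader(data)
  assert(#data >= 12, "header too short")
  assert(data:sub(1, 4) == "PIEH", "bad magic tag: not a .flo file")
  local width = readUInt32LE(data, 5)
  local height = readUInt32LE(data, 9)
  -- 12 header bytes plus two 4-byte floats per pixel.
  return width, height, 12 + width * height * 2 * 4
end

-- Header for a 512-wide, 384-high flow field.
local header = "PIEH" .. string.char(0, 2, 0, 0) .. string.char(128, 1, 0, 0)
local w, h, size = parseFloHeader(header)
print(w, h, size)
```

In practice the first 12 bytes could be obtained with io.open(filename, "rb"):read(12), and comparing the implied size with the actual file size is a cheap sanity check before loading.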
[flow] flowX.loadPFM(filename)
Reads a .pfm file. Loads the x and y components of optical flow into a 2-channel 2xMxN optical flow field. The first channel stores the x component and the second channel stores the y component.
[flow_rotated] flowX.rotate(flow, angle)
Rotates flow of size 2xMxN by angle in radians. Uses nearest-neighbor interpolation to avoid blurring at boundaries.
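Rotating a flow field involves two separate things: resampling the grid, and rotating each displacement vector so it still points the right way. The plain-Lua sketch below shows only the second part for a single vector. The name rotateVector is hypothetical, and the sign convention of flowX.rotate is not stated here, so treat the direction as an assumption.

```lua
-- Rotate a single flow vector (fx, fy) by `angle` radians.
-- This is counterclockwise in a y-up frame; in image coordinates, where y
-- points down, the same formula appears clockwise on screen.
local function rotateVector(fx, fy, angle)
  local c, s = math.cos(angle), math.sin(angle)
  return fx * c - fy * s, fx * s + fy * c
end

local rx, ry = rotateVector(1, 0, math.pi / 2)
print(rx, ry)
```

The rotation leaves the length of each vector unchanged, so the flow magnitude is preserved.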
[flow_scaled] flowX.scale(flow, sc, [opt])
Scales flow of size 2xMxN by sc times. opt (optional) specifies the interpolation method: 'simple' (default), 'bilinear', or 'bicubic'.
[flowBatch_scaled] flowX.scaleBatch(flowBatch, sc)
Scales flowBatch of size Bx2xMxN, a batch of B flow fields, by sc times. Uses nearest-neighbor interpolation.
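When a flow field is resized, the displacement values are usually multiplied by the same factor so that they stay in pixel units of the new grid. The plain-Lua sketch below illustrates this with nearest-neighbor sampling on nested tables, for integer factors only. The names scaleChannel and scaleFlow are hypothetical, and whether flowX.scale rescales the values in exactly this way is an assumption.

```lua
-- Upscale one flow channel (a table of rows) by an integer factor `sc` using
-- nearest-neighbor sampling, and multiply each displacement by `sc`.
local function scaleChannel(channel, sc)
  local rows, cols = #channel, #channel[1]
  local out = {}
  for i = 1, rows * sc do
    local src = channel[math.floor((i - 1) / sc) + 1] -- nearest source row
    local row = {}
    for j = 1, cols * sc do
      row[j] = src[math.floor((j - 1) / sc) + 1] * sc
    end
    out[i] = row
  end
  return out
end

-- Scale a 2xMxN flow field stored as { xChannel, yChannel }.
local function scaleFlow(flow, sc)
  return { scaleChannel(flow[1], sc), scaleChannel(flow[2], sc) }
end

local flow = { { { 1, 2 } }, { { 0, -1 } } } -- a 2x1x2 flow field
local scaled = scaleFlow(flow, 2)            -- becomes 2x2x4
print(#scaled[1], #scaled[1][1], scaled[1][1][3], scaled[2][2][4])
```

A one-pixel motion at the coarse resolution becomes a two-pixel motion after doubling the grid, which is why the values are multiplied as well.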
Timing Benchmarks
Our timing benchmark uses the Flying Chairs dataset. To run it, first download the dataset:
wget http://lmb.informatik.uni-freiburg.de/resources/datasets/FlyingChairs/FlyingChairs.zip
Run the timing benchmark
th timing_benchmark.lua -data YOUR_FLYING_CHAIRS_DATA_DIRECTORY
References
- Our warping code is based on qassemoquab/stnbhwd.
- The images in samples are from the Flying Chairs dataset: Dosovitskiy, Alexey, et al. "FlowNet: Learning Optical Flow with Convolutional Networks." 2015 IEEE International Conference on Computer Vision (ICCV). IEEE, 2015.
- Some parts of flowExtensions.lua are adapted from marcoscoffier/optical-flow with help from fguney.
- The unofficial PyTorch implementation is from sniklaus.
License
Free for non-commercial and scientific research purposes. For commercial use, please contact ps-license@tue.mpg.de. Check LICENSE file for details.
When using this code, please cite
Ranjan, Anurag, and Michael J. Black. "Optical Flow Estimation using a Spatial Pyramid Network." arXiv preprint arXiv:1611.00850 (2016).