Inpainting missing data

Missing data in an image can be an issue, especially when one wants to perform Fourier analysis. This tutorial explains how to fill in missing pixels with values which look “realistic” and introduce as little perturbation as possible for subsequent analysis. The user should keep the mask nearby and only consider the values of actual pixels, never the inpainted ones.
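As a minimal sketch of this last point (assuming an inpainted image inpainted and the boolean detector mask mask, both defined later in this tutorial), one way to carry the mask along is a numpy masked array, so that any statistics are computed over measured pixels only:

import numpy
# Illustrative only: mask==True flags the gaps, i.e. the inpainted pixels.
# Masked entries are excluded from any subsequent statistics.
real_only = numpy.ma.MaskedArray(inpainted, mask=mask)
print("Mean intensity over measured pixels only:", real_only.mean())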

This tutorial uses fully synthetic data to allow comparison between the actual (synthetic) data and the inpainted values.

The first part of the tutorial covers the generation of a challenging 2D diffraction image with realistic noise and describes the metric used; then comes the actual tutorial on how to use the inpainting. Finally, a benchmark based on this metric is used to optimize the parameters.

Creation of the image

A realistic challenging image should contain:

  • Bragg peak rings. We chose LaB6 as a guinea pig, with very sharp peaks at the limit of the detector resolution
  • Some amorphous content
  • Strong polarization effect
  • Poissonian noise

Several such images will be generated with different noise realizations in order to discriminate the effect of the noise from other effects.

In [1]:
%pylab nbagg
Populating the interactive namespace from numpy and matplotlib
In [2]:
import pyFAI
print("Using pyFAI version: ", pyFAI.version)
from pyFAI.gui import jupyter
import pyFAI.test.utilstest
from pyFAI.calibrant import get_calibrant
import time
start_time = time.time()
/mntdirect/_scisoft/users/jupyter/jupy35/lib/python3.5/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using pyFAI version:  0.15.0-beta4
In [3]:
detector = pyFAI.detector_factory("Pilatus2MCdTe")
mask = detector.mask.copy()
nomask = numpy.zeros_like(mask)
detector.mask=nomask
ai = pyFAI.AzimuthalIntegrator(detector=detector)
ai.setFit2D(200, 200, 200)
ai.wavelength = 3e-11
print(ai)
Detector Pilatus CdTe 2M         PixelSize= 1.720e-04, 1.720e-04 m
Wavelength= 3.000000e-11m
SampleDetDist= 2.000000e-01m    PONI= 3.440000e-02, 3.440000e-02m       rot1=0.000000  rot2= 0.000000  rot3= 0.000000 rad
DirectBeamDist= 200.000mm       Center: x=200.000, y=200.000 pix        Tilt=0.000 deg  tiltPlanRotation= 0.000 deg
In [4]:
LaB6 = get_calibrant("LaB6")
LaB6.wavelength = ai.wavelength
print(LaB6)
r = ai.array_from_unit(unit="q_nm^-1")
decay_b = numpy.exp(-(r-50)**2/2000)
bragg = LaB6.fake_calibration_image(ai, Imax=1e4, W=1e-6) * ai.polarization(factor=1.0) * decay_b
decay_a = numpy.exp(-r/100)
amorphous = 1000*ai.polarization(factor=1.0)*ai.solidAngleArray() * decay_a
img_nomask = bragg + amorphous
# Apply Poisson noise; different noise realizations are used for the different images
img_nomask = numpy.random.poisson(img_nomask)
img_nomask2 = numpy.random.poisson(img_nomask)
img = numpy.random.poisson(img_nomask)
img[numpy.where(mask)] = -1
fig,ax = subplots(1,2, figsize=(10,5))
jupyter.display(img=img, label="With mask", ax=ax[0])
jupyter.display(img=img_nomask, label="Without mask", ax=ax[1])
LaB6 Calibrant at wavelength 3e-11
Out[4]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f8f84960898>

Note the aliasing effect on the displayed images.

We will now measure the effect after 1D integration. We deliberately do not correct for polarization in order to highlight the defect one wishes to wipe out. We use an R-factor to describe the quality of the 1D-integrated signal.
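For reference, such an R-factor compares two integrated profiles bin by bin, with a weight reflecting the counting statistics. The sketch below conveys the idea; rwp_sketch and its 1/I weighting are illustrative assumptions, and the exact formula used by pyFAI.utils.mathutil.rwp may differ:

import numpy

def rwp_sketch(res_a, res_b):
    # res_a, res_b: Integrate1dResult-like objects sharing the same radial bins
    I_a, I_b = res_a.intensity, res_b.intensity
    avg = 0.5 * (I_a + I_b)
    # Weight each bin by 1/I, a common choice for Poisson-dominated data
    w = numpy.where(avg > 0, 1.0 / avg, 0.0)
    return 100.0 * numpy.sqrt((w * (I_a - I_b) ** 2).sum() / (w * I_a ** 2).sum())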

In [5]:
wo = ai.integrate1d(img_nomask, 2000, unit="q_nm^-1", method="splitpixel", radial_range=(0,210))
wo2 = ai.integrate1d(img_nomask2, 2000, unit="q_nm^-1", method="splitpixel", radial_range=(0,210))
wm = ai.integrate1d(img, 2000, unit="q_nm^-1", method="splitpixel", mask=mask, radial_range=(0,210))
ax = jupyter.plot1d(wm , label="with_mask")
ax.plot(*wo, label="without_mask")
ax.plot(*wo2, label="without_mask2")
ax.plot(wo.radial, wo.intensity-wm.intensity, label="delta")
ax.plot(wo.radial, wo.intensity-wo2.intensity, label="relative-error")
ax.legend()
print("Between masked and non masked image R= %s"%pyFAI.utils.mathutil.rwp(wm,wo))
print("Between two different non-masked images R'= %s"%pyFAI.utils.mathutil.rwp(wo2,wo))
Between masked and non masked image R= 5.673393078616664
Between two different non-masked images R'= 0.24760409984292858
In [6]:
# Effect of the noise on the delta image
fig, ax = subplots()
jupyter.display(img=img_nomask-img_nomask2, label="Delta due to noise", ax=ax)
ax.figure.colorbar(ax.images[0])
Out[6]:
<matplotlib.colorbar.Colorbar at 0x7f8f848a1860>

Inpainting

This part describes how to fill in the missing pixels so as to obtain a “natural-looking” image. The delta image contains the difference with the original image.

In [7]:
#Inpainting:
inpainted = ai.inpainting(img, mask=mask, poissonian=True)
fig, ax = subplots(1, 2, figsize=(12,5))
jupyter.display(img=inpainted, label="Inpainted", ax=ax[0])
jupyter.display(img=img_nomask-inpainted, label="Delta", ax=ax[1])
ax[1].figure.colorbar(ax[1].images[0])
Out[7]:
<matplotlib.colorbar.Colorbar at 0x7f8f84827a90>
In [8]:
# Comparison of the inpainted image with the original one:
wm = ai.integrate1d(inpainted, 2000, unit="q_nm^-1", method="splitpixel", radial_range=(0,210))
wo = ai.integrate1d(img_nomask, 2000, unit="q_nm^-1", method="splitpixel", radial_range=(0,210))
ax = jupyter.plot1d(wm , label="inpainted")
ax.plot(*wo, label="without_mask")
ax.plot(wo.radial, wo.intensity-wm.intensity, label="delta")
ax.legend()
print("R= %s"%pyFAI.utils.mathutil.rwp(wm,wo))
R= 1.292829889486801

One can see by zooming in that the main effect of inpainting is a broadening of the signal in the inpainted region. This could (partially) be addressed by increasing the number of radial bins used in the inpainting.
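For instance, one can request a finer radial slicing directly (a usage sketch anticipating the systematic exploration below; npt_rad=4096 is just an example value):

# Finer radial slicing reduces the broadening in the inpainted region
inpainted_fine = ai.inpainting(img, mask=mask, poissonian=True, npt_rad=4096)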

Benchmarking and optimization of the parameters

The parameter set depends on the detector, the experiment geometry and the type of signal on the detector. Finer details require finer slicing.

In [9]:
#Basic benchmarking of execution time for default options:
%timeit inpainted = ai.inpainting(img, mask=mask)
595 ms ± 4.93 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [10]:
wo = ai.integrate1d(img_nomask, 2000, unit="q_nm^-1", method="splitpixel", radial_range=(0,210))
for j in ("csr", "csr_nosplit"):
    for k in (512,1024,2048, 4096):
        ai.reset()
        for i in range(10):
            inpainted = ai.inpainting(img, mask=mask, poissonian=True, method=j, npt_rad=k, grow_mask=i)
            wm = ai.integrate1d(inpainted, 2000, unit="q_nm^-1", method="splitpixel", radial_range=(0,210))
            print("method: %s npt_rad=%i grow=%i; R= %s"%(j, k, i,pyFAI.utils.mathutil.rwp(wm,wo)))
method: csr npt_rad=512 grow=0; R= 3.161664269613865
method: csr npt_rad=512 grow=1; R= 2.899216323473676
method: csr npt_rad=512 grow=2; R= 2.701508750780556
method: csr npt_rad=512 grow=3; R= 2.65061213676219
method: csr npt_rad=512 grow=4; R= 2.617555180485093
method: csr npt_rad=512 grow=5; R= 2.57347618498842
method: csr npt_rad=512 grow=6; R= 2.5412272779190777
method: csr npt_rad=512 grow=7; R= 2.5424781610950076
method: csr npt_rad=512 grow=8; R= 2.535123698786027
method: csr npt_rad=512 grow=9; R= 2.5269052053507344
method: csr npt_rad=1024 grow=0; R= 1.6838218544572787
method: csr npt_rad=1024 grow=1; R= 1.3149897216792887
method: csr npt_rad=1024 grow=2; R= 1.3116298971916918
method: csr npt_rad=1024 grow=3; R= 1.2896416017932026
method: csr npt_rad=1024 grow=4; R= 1.2747399769865193
method: csr npt_rad=1024 grow=5; R= 1.2724794383058406
method: csr npt_rad=1024 grow=6; R= 1.2695832434720735
method: csr npt_rad=1024 grow=7; R= 1.2654003153510502
method: csr npt_rad=1024 grow=8; R= 1.2663188006453072
method: csr npt_rad=1024 grow=9; R= 1.2626226616691933
method: csr npt_rad=2048 grow=0; R= 0.8900897902054383
method: csr npt_rad=2048 grow=1; R= 0.6415733019083507
method: csr npt_rad=2048 grow=2; R= 0.6279947649332732
method: csr npt_rad=2048 grow=3; R= 0.6251265076782447
method: csr npt_rad=2048 grow=4; R= 0.6278123553337961
method: csr npt_rad=2048 grow=5; R= 0.6259227104725384
method: csr npt_rad=2048 grow=6; R= 0.6278659029820132
method: csr npt_rad=2048 grow=7; R= 0.6254215042148916
method: csr npt_rad=2048 grow=8; R= 0.625004425968083
method: csr npt_rad=2048 grow=9; R= 0.6295803386067526
method: csr npt_rad=4096 grow=0; R= 0.5604588897893655
method: csr npt_rad=4096 grow=1; R= 0.41266216393941213
method: csr npt_rad=4096 grow=2; R= 0.4164994702238333
method: csr npt_rad=4096 grow=3; R= 0.4076499027237503
method: csr npt_rad=4096 grow=4; R= 0.4063862172725736
method: csr npt_rad=4096 grow=5; R= 0.4050703629863828
method: csr npt_rad=4096 grow=6; R= 0.4025256608487438
method: csr npt_rad=4096 grow=7; R= 0.40994754826932805
method: csr npt_rad=4096 grow=8; R= 0.4155232320302788
WARNING:pyFAI.ext.splitBBoxCSR:Pixel splitting desactivated !
method: csr npt_rad=4096 grow=9; R= 0.41143678232251757
method: csr_nosplit npt_rad=512 grow=0; R= 3.8595865806789607
method: csr_nosplit npt_rad=512 grow=1; R= 2.9476707678276144
method: csr_nosplit npt_rad=512 grow=2; R= 2.67526847683543
method: csr_nosplit npt_rad=512 grow=3; R= 2.6080494256235225
method: csr_nosplit npt_rad=512 grow=4; R= 2.5790339416013826
method: csr_nosplit npt_rad=512 grow=5; R= 2.605540960437123
method: csr_nosplit npt_rad=512 grow=6; R= 2.5785552902866615
method: csr_nosplit npt_rad=512 grow=7; R= 2.538169720300074
method: csr_nosplit npt_rad=512 grow=8; R= 2.5224413279703737
WARNING:pyFAI.ext.splitBBoxCSR:Pixel splitting desactivated !
method: csr_nosplit npt_rad=512 grow=9; R= 2.523061878481299
method: csr_nosplit npt_rad=1024 grow=0; R= 2.766934446140325
method: csr_nosplit npt_rad=1024 grow=1; R= 1.4564106728759716
method: csr_nosplit npt_rad=1024 grow=2; R= 1.3101223014928907
method: csr_nosplit npt_rad=1024 grow=3; R= 1.2790403169316522
method: csr_nosplit npt_rad=1024 grow=4; R= 1.2561718597936133
method: csr_nosplit npt_rad=1024 grow=5; R= 1.2855850582333272
method: csr_nosplit npt_rad=1024 grow=6; R= 1.1674261474849386
method: csr_nosplit npt_rad=1024 grow=7; R= 1.1423541811585882
method: csr_nosplit npt_rad=1024 grow=8; R= 1.1493437401481235
WARNING:pyFAI.ext.splitBBoxCSR:Pixel splitting desactivated !
method: csr_nosplit npt_rad=1024 grow=9; R= 1.1578271623584184
method: csr_nosplit npt_rad=2048 grow=0; R= 2.8079162385974765
method: csr_nosplit npt_rad=2048 grow=1; R= 1.2537733096951353
method: csr_nosplit npt_rad=2048 grow=2; R= 1.147051051470379
method: csr_nosplit npt_rad=2048 grow=3; R= 1.0743694280501137
method: csr_nosplit npt_rad=2048 grow=4; R= 0.9905178466100836
method: csr_nosplit npt_rad=2048 grow=5; R= 0.8931651908640255
method: csr_nosplit npt_rad=2048 grow=6; R= 0.8393120612647744
method: csr_nosplit npt_rad=2048 grow=7; R= 0.7861280951027778
method: csr_nosplit npt_rad=2048 grow=8; R= 0.776802155719979
WARNING:pyFAI.ext.splitBBoxCSR:Pixel splitting desactivated !
method: csr_nosplit npt_rad=2048 grow=9; R= 0.7514325674692471
method: csr_nosplit npt_rad=4096 grow=0; R= 2.7020869303454393
method: csr_nosplit npt_rad=4096 grow=1; R= 1.2136152541782632
method: csr_nosplit npt_rad=4096 grow=2; R= 1.2045245374268114
method: csr_nosplit npt_rad=4096 grow=3; R= 1.1831685518676884
method: csr_nosplit npt_rad=4096 grow=4; R= 1.150484641645269
method: csr_nosplit npt_rad=4096 grow=5; R= 1.1153935156128965
method: csr_nosplit npt_rad=4096 grow=6; R= 1.0772705131115292
method: csr_nosplit npt_rad=4096 grow=7; R= 1.063815528794621
method: csr_nosplit npt_rad=4096 grow=8; R= 1.0102320997994851
method: csr_nosplit npt_rad=4096 grow=9; R= 0.9763651269864748
In [11]:
#Inpainting, best solution found:
ai.reset()
%time inpainted = ai.inpainting(img, mask=mask, poissonian=True, method="csr", npt_rad=4096, grow_mask=5)
fig, ax = subplots(1, 2, figsize=(12,5))
jupyter.display(img=inpainted, label="Inpainted", ax=ax[0])
jupyter.display(img=img_nomask-inpainted, label="Delta", ax=ax[1])
ax[1].figure.colorbar(ax[1].images[0])
CPU times: user 3.89 s, sys: 348 ms, total: 4.24 s
Wall time: 936 ms
Out[11]:
<matplotlib.colorbar.Colorbar at 0x7f8f7f798048>
In [12]:
# Comparison of the inpainted image with the original one:
wm = ai.integrate1d(inpainted, 2000, unit="q_nm^-1", method="splitpixel", radial_range=(0,210))
wo = ai.integrate1d(img_nomask, 2000, unit="q_nm^-1", method="splitpixel", radial_range=(0,210))
ax = jupyter.plot1d(wm , label="inpainted")
ax.plot(*wo, label="without_mask")
ax.plot(wo.radial, wo.intensity-wm.intensity, label="delta")
ax.legend()
print("R= %s"%pyFAI.utils.mathutil.rwp(wm,wo))
R= 0.4024282744203417

Conclusion

Inpainting is one of the only solutions for filling the gaps of a detector when Fourier analysis is needed. This tutorial explains the basics of how this can be done using the pyFAI library and how to optimize the parameter set for inpainting. The result may vary greatly with the detector position and tilt and with the kind of signal (amorphous or more spotty).

In [13]:
print("Execution time: %.3fs"%(time.time()-start_time))
Execution time: 63.118s