VARIABLE PHOTOREALISTIC IMAGE SYNTHESIS FOR TRAINING DATASET GENERATION
Abstract
Photorealistic rendering systems have recently found new applications in artificial intelligence, specifically in computer vision, where they are used to generate image and video datasets. The challenge in this application is producing a large number of photorealistic images with high variability in the 3D models and their appearance. In this work, we propose an approach that combines existing procedural texture generation techniques with domain randomization to generate a large number of highly varied digital assets during the rendering process. This eliminates the need for a large pre-existing database of digital assets (only a small set of 3D models is required) and produces objects with unique appearance at the rendering stage, reducing both image post-processing and storage requirements. Our approach uses procedural texturing and material substitution to rapidly produce a large number of variations of digital assets. The proposed solution can be used to produce training datasets for artificial intelligence applications and can be combined with most state-of-the-art scene generation methods.

Keywords:
photorealistic rendering, procedural generation, synthetic datasets, computer vision
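
The following minimal Python sketch illustrates the core idea described in the abstract: a small set of base 3D models is paired with randomly sampled procedural-material parameters, domain-randomization style, so that each rendered instance receives a unique appearance. This is an illustrative assumption, not the paper's actual implementation; all names in it (AssetVariant, sample_variant, the model and texture lists) are hypothetical, and in a real pipeline the sampled parameters would be handed to a physically based renderer at render time.

```python
# Illustrative sketch only (not the authors' implementation): sample
# randomized procedural-material variants for a small fixed set of
# base 3D models, so unique appearances are generated per render.
import random
from dataclasses import dataclass

BASE_MODELS = ["chair.obj", "table.obj", "lamp.obj"]        # small fixed model set (hypothetical)
PROCEDURAL_TEXTURES = ["noise", "checker", "wood", "marble"]  # procedural texture families (hypothetical)

@dataclass
class AssetVariant:
    model: str            # which base mesh to instance
    texture: str          # procedural texture family to apply
    texture_scale: float  # frequency of the procedural pattern
    base_color: tuple     # RGB albedo tint
    roughness: float      # roughness of the substituted material

def sample_variant(rng: random.Random) -> AssetVariant:
    """Draw one randomized appearance for a base model."""
    return AssetVariant(
        model=rng.choice(BASE_MODELS),
        texture=rng.choice(PROCEDURAL_TEXTURES),
        texture_scale=rng.uniform(0.5, 8.0),
        base_color=tuple(rng.uniform(0.05, 0.95) for _ in range(3)),
        roughness=rng.uniform(0.1, 0.9),
    )

if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed -> reproducible dataset
    # Each variant is materialized only at render time, so no large
    # pre-built asset database has to be stored on disk.
    for i in range(5):
        print(i, sample_variant(rng))
```

Because each variant is described by a handful of sampled parameters rather than a baked texture, only the small set of base models needs to be stored; the unique textures and materials exist only during rendering, which is what removes the storage and post-processing burden mentioned above.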