TMPA-2021: Agenda

Please note that all times are given in the GMT+3 time zone (currently observed in Moscow, Istanbul, Jerusalem, Djibouti, Baghdad, etc.).

Day 1
25 November
Papers Speaker/Author
Session 1 Neural networks  
9:20 Conference Opening Rostislav Yavorsky, TPU
Alexey Khoroshilov

Alexey Khoroshilov, Lead Researcher, Ivannikov Institute for System Programming of the Russian Academy of Sciences

Abstract:

The talk will review the challenges that formal methods face when applied in industrial settings, as well as the patterns that are often used to overcome these challenges. Alexey will also share the experience of using these patterns within the projects of the Linux Verification Center of ISPRAS.

Dmitrii Kolpashchikov

Dmitrii Kolpashchikov, Engineer, Tomsk Polytechnic University

Abstract:

Image classification is a classic machine learning (ML) problem. Neural networks are widely used for object classification. Despite the existence of a large number of image classification algorithms, very little attention is paid to video data classification. When convolutional neural networks are used to classify frames of a video sequence, image features must be combined to obtain a prediction. However, with this approach the features of object dynamics are ignored, since the images are processed one at a time. Therefore, the analysis of objects with dynamically changing characteristics remains a relevant issue. To address it, the authors propose using a neural network with long short-term memory (LSTM). In contrast to classical convolutional neural networks (CNN), the proposed network uses information about the sequence of images, thereby providing a higher classification accuracy for detected objects with dynamic characteristics. In the study, the authors analyze the classification accuracy of smoke cloud detection in a forest using various machine learning methods. The authors present models for classifying a single frame and a sequence of frames of a video. The results of the machine learning models are presented, together with a comparative analysis of single-frame and frame-sequence classification. The accuracy of video sequence classification by the recurrent neural network model with an LSTM layer was 85.7%.
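The point about ignored dynamics can be illustrated with a small sketch. It is not the authors' model: the per-frame scores, the weights `w` and `u`, and the plain `tanh` recurrent update (a much simpler stand-in for an LSTM cell) are all assumptions made for illustration. Averaging per-frame scores gives the same result for a sequence and its reverse, while a recurrent state depends on the order of frames.

```python
import math

def mean_pool(features):
    """Order-invariant aggregation: average the per-frame scores."""
    return sum(features) / len(features)

def recurrent_state(features, w=1.0, u=0.5):
    """Order-sensitive aggregation with a plain recurrent update (not an LSTM)."""
    h = 0.0
    for x in features:
        h = math.tanh(w * x + u * h)  # new state mixes current frame and history
    return h

growing = [0.1, 0.4, 0.9]   # hypothetical score rising over time
shrinking = growing[::-1]   # same frames, opposite dynamics

print(mean_pool(growing), mean_pool(shrinking))              # practically identical
print(recurrent_state(growing), recurrent_state(shrinking))  # clearly different
```

An LSTM adds gating on top of this idea so that the state can retain information over longer sequences.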

Mikhail Lebedev

Mikhail Lebedev, Ivannikov Institute for System Programming of the RAS

Abstract:

Artificial neural networks play a great role in modern life. Neural networks are executed on different hardware platforms: from CPUs and GPUs to FPGAs and ASICs. Many open-source tools help to optimize networks and run inference on these platforms, or even synthesize specialized hardware. This article surveys a range of open-source tools for neural network optimization, acceleration and hardware synthesis. Some of the tools have been evaluated using three simple neural network examples. A CPU, a GPU and an FPGA device were used for the evaluation. Results show that some of the chosen tools can successfully process neural network models and optimize them for CPU and GPU execution, whereas FPGA execution results are controversial.

Konstantin Maslov

Konstantin Maslov, PhD Student, Tomsk Polytechnic University

Abstract:

This paper introduces a modification of the ordinary Bayesian optimization algorithm for hyperparameter tuning of neural networks. The proposed algorithm utilizes a time-decaying parameter ξ (jitter) to dynamically balance exploration and exploitation. This algorithm is compared with the ordinary Bayesian optimization algorithm with various constant values of ξ; for that, diverse artificial landscapes were used. In this comparison, for some artificial landscapes and numbers of dimensions of the search domain, the proposed algorithm shows better performance. For some others, the ordinary algorithm outperforms the proposed one, but in most cases there is no statistically significant difference between the two algorithms. Both algorithms are then used to tune hyperparameters of a neural network for semantic image segmentation. The corresponding analysis has shown that both algorithms give comparable performance.
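As a rough illustration of how a jitter parameter enters Bayesian optimization, the sketch below uses the standard expected improvement acquisition function for maximization. The abstract does not state the acquisition function or the decay schedule, so the choice of expected improvement, the exponential schedule in `decayed_xi`, and the parameter names `xi0` and `decay` are assumptions.

```python
import math

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_improvement(mu, sigma, best, xi):
    """Expected improvement for maximization; a larger xi favors exploration."""
    improvement = mu - best - xi
    if sigma <= 0.0:
        return max(improvement, 0.0)  # no uncertainty left at this point
    z = improvement / sigma
    return improvement * normal_cdf(z) + sigma * normal_pdf(z)

def decayed_xi(xi0, decay, iteration):
    """Hypothetical exponential schedule: xi shrinks as iterations pass."""
    return xi0 * math.exp(-decay * iteration)

# Early iterations use a large xi (explore), later ones a small xi (exploit).
for t in (0, 5, 20):
    xi = decayed_xi(xi0=0.1, decay=0.2, iteration=t)
    print(t, xi, expected_improvement(mu=1.0, sigma=0.5, best=0.8, xi=xi))
```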

Evgeniy Mytsko

Evgeniy Mytsko, Associate Professor, Tomsk Polytechnic University

Abstract:

Two convolutional neural networks (CNNs), U-Net and U-Net with dilated convolutions, were implemented. In order to train and test the CNNs, we utilised unmanned aerial vehicle images containing Abies sibirica trees damaged by Polygraphus proximus. The images contain five classes: four classes of trees depending on their condition, plus background. The weights of the CNNs, obtained as a result of the training, were then used to implement the CNNs on a field-programmable gate array–based system-on-a-chip platform (Xilinx Zynq 7000). The paper also presents a comparison of the hardware-implemented CNNs in terms of classification quality and time efficiency.
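For readers unfamiliar with dilated convolutions, here is a minimal one-dimensional sketch in plain Python. It is only an illustration of the operation, not the authors' U-Net or their FPGA implementation; the function name and the example signal are made up.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid 1-D cross-correlation with kernel taps spaced `dilation` samples apart."""
    span = (len(kernel) - 1) * dilation + 1  # receptive field of one output value
    return [
        sum(k * signal[start + i * dilation] for i, k in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]

signal = [1, 2, 3, 4, 5, 6]
print(dilated_conv1d(signal, [1, 1, 1], dilation=1))  # [6, 9, 12, 15]
print(dilated_conv1d(signal, [1, 1, 1], dilation=2))  # [9, 12]
```

With the same three weights, dilation 2 covers five input samples instead of three, which is how dilated convolutions widen the receptive field without adding parameters.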

12:00 Break  
Session 2 Process Mining  
Mohamed Elsayed Ahmed Mohamed (Abd Elaziz)

Mohamed Elsayed Ahmed Mohamed (Abd Elaziz), Professor at School of Computer Science and Robotics, TPU

Abstract:

The field of metaheuristic (MH) techniques has flourished over the years due to their strong influence on the performance of different artificial intelligence techniques that are applied in various real-world applications. These MH techniques fall into four categories: 1) swarm-based, 2) evolutionary-based, 3) human-based, and 4) physics-based. By simulating these behaviors, MH techniques have established their performance in solving single- and multi-objective optimization problems, for example in cloud computing, healthcare, engineering problems, and others.

Yousra Hafidi

Yousra Hafidi, Eindhoven University of Technology

Abstract:

Peterson's mutual exclusion algorithm for two processes has been generalized to N processes in various ways. As far as we know, no such generalization is starvation free without making any fairness assumptions. In this paper, we study the generalization of Peterson's algorithm to N processes using a tournament tree. Using the mCRL2 language and toolset we prove that it is not starvation free unless weak fairness assumptions are incorporated. Inspired by the counterexample for starvation freedom, we propose a fair N-process generalization of Peterson's algorithm. We use model checking to show that our new algorithm is correct for small N. For arbitrary N, model checking is infeasible due to the state space explosion problem, and instead, we present a general proof that, for N ≥ 4, when a process requests access to the critical section, other processes can enter first at most (N - 1)(N - 2) times.
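To give a flavor of what model checking a mutual exclusion algorithm involves, the sketch below exhaustively explores the state space of the classic two-process Peterson algorithm and checks that both processes are never in the critical section at once. It is a plain breadth-first search in Python, not mCRL2, and it covers neither the tournament-tree generalization nor starvation freedom; the four-step encoding of each process is an assumption of this sketch.

```python
from collections import deque

def step(state, i):
    """Successor state when process i takes one atomic step, or None if it is blocked."""
    pcs, flags, turn = list(state[0]), list(state[1]), state[2]
    j = 1 - i
    if pcs[i] == 0:        # announce interest
        flags[i] = True
        pcs[i] = 1
    elif pcs[i] == 1:      # give priority to the other process
        turn = j
        pcs[i] = 2
    elif pcs[i] == 2:      # wait while the other is interested and has priority
        if flags[j] and turn == j:
            return None
        pcs[i] = 3         # enter the critical section
    else:                  # leave the critical section
        flags[i] = False
        pcs[i] = 0
    return (tuple(pcs), tuple(flags), turn)

def explore():
    """Breadth-first search over all interleavings of the two processes."""
    init = ((0, 0), (False, False), 0)
    seen = {init}
    queue = deque([init])
    while queue:
        state = queue.popleft()
        for i in (0, 1):
            nxt = step(state, i)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

reachable = explore()
mutual_exclusion = all(s[0] != (3, 3) for s in reachable)
print(len(reachable), mutual_exclusion)
```

The state space here is tiny, but it grows quickly with the number of processes, which is the state space explosion problem the abstract refers to.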

Yinzheng Zhong

Yinzheng Zhong, University of Liverpool

Abstract:

In this paper, we consider the applications of process mining in intrusion detection. We propose a novel process-mining-inspired algorithm for preprocessing data in intrusion detection systems (IDS). The algorithm is designed to process network packet data, and it works well in online mode for online intrusion detection. To test our algorithm, we used the CSE-CIC-IDS2018 dataset, which contains several common attacks. The packet data was preprocessed with this algorithm and then fed into the detectors. We report on experiments using the algorithm with different machine learning (ML) models as classifiers to verify that our algorithm works as expected; we also tested performance with anomaly detection methods and report results for the existing preprocessing tool CICFlowMeter for comparison.
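The abstract does not describe the algorithm itself. As a loosely related illustration only, the sketch below computes directly-follows counts, a common process mining building block, over a sliding window of events so that features can be updated online as packets arrive. Treating TCP flag names as event labels and the window size are assumptions of this sketch.

```python
from collections import Counter, deque

def directly_follows_windows(events, window):
    """Yield directly-follows counts over the last `window` events after each new event."""
    recent = deque(maxlen=window)  # old events fall out automatically
    for event in events:
        recent.append(event)
        items = list(recent)
        yield Counter(zip(items, items[1:]))

packets = ["SYN", "SYN-ACK", "ACK", "ACK"]  # hypothetical event labels
for counts in directly_follows_windows(packets, window=3):
    print(dict(counts))
```

Each yielded counter could be flattened into a fixed-length feature vector and passed to a classifier.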

14:40 Closing Day 1  
Day 2
26 November
Papers Speaker/Author
Session 3 Software Testing  
Iosif Itkin

Iosif Itkin, CEO and co-founder, Exactpro

Abstract:

As part of the TMPA-2021 conference, Exactpro’s CEO and co-founder Iosif Itkin will give a presentation titled "Automation in Software Testing. Humans and Complex Models". In this presentation, Iosif will give a brief overview of research on the concept of model-based testing and the principal challenges of its application in testing complex distributed systems. He will also outline the broader context of interaction between humans and complex computer models.

Alexey Yermolayev

Alexey Yermolayev, QA Project Manager, Exactpro

Nikolay Dorofeev

Nikolay Dorofeev, Senior DocOps Engineer

Abstract:

The focus of this research is tools and methods for reconciliation testing, an approach to software testing that relies on the data reconciliation concept. The importance of this test approach is steadily increasing across different knowledge domains, driven by growing data volumes and the overall complexity of present-day software systems. The paper describes a software implementation created as part of the authors’ industrial experience with data streaming analysis for reconciliation testing of complex financial technology systems. The described solution is a Python-based component of an open-source test automation framework built as a Kubernetes-based microservices platform. The paper outlines the advantages and disadvantages of the approach and compares it to existing state-of-the-art solutions for data streaming analysis and reconciliation checks.
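The basic idea of a reconciliation check can be sketched as follows. This is a simplified batch comparison, not the streaming component described in the paper; the record layout, the `id` key and the function name are hypothetical.

```python
def reconcile(source, target, key="id"):
    """Match records from two feeds by key and report the discrepancies."""
    src = {rec[key]: rec for rec in source}
    tgt = {rec[key]: rec for rec in target}
    return {
        "missing_in_target": sorted(src.keys() - tgt.keys()),
        "missing_in_source": sorted(tgt.keys() - src.keys()),
        "mismatched": sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k]),
    }

# Hypothetical feeds, e.g. orders sent by a gateway vs. orders seen downstream.
source = [{"id": 1, "qty": 10}, {"id": 2, "qty": 5}, {"id": 3, "qty": 7}]
target = [{"id": 1, "qty": 10}, {"id": 2, "qty": 6}, {"id": 4, "qty": 1}]
report = reconcile(source, target)
print(report)
```

A streaming version would keep unmatched records in a bounded buffer and emit discrepancies once a matching window expires.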

Ivan Scherbinin

Ivan Scherbinin, Project Manager, Exactpro

Tiniko Babalashvili

Tiniko Babalashvili, Junior Software Tester, Exactpro

Luba Konnova

Luba Konnova, Senior Project Manager, Exactpro

Levan Gharibashvili

Levan Gharibashvili, Middle Python Developer, Exactpro

Abstract:

This paper looks at the industrial experience of using an automated model-based approach for the testing of trading systems. The approach, used by Exactpro, is described using two existing taxonomies. Then, the main future applications of the models in the test automation paradigm are outlined. In our approach, a model is a kind of virtual replica of the system under test that generates expected results from the input (built based on the specifications). Models are created in Python, which provides flexibility in describing the behavior of complex financial systems. Models are integrated into the Exactpro th2 test automation framework, and the results from the system under test and from the model are compared automatically.
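The model-versus-system comparison can be pictured with a toy example. Nothing here reflects the th2 API or a real trading rule set: the fill rule, the function names and the deliberately buggy stand-in for the system under test are all hypothetical.

```python
def model_expected(order, best_ask):
    """Reference model: a buy order fills when its limit reaches the best ask."""
    return "FILLED" if order["limit"] >= best_ask else "RESTING"

def system_under_test(order, best_ask):
    """Stand-in for the real system, with a deliberate boundary bug (> instead of >=)."""
    return "FILLED" if order["limit"] > best_ask else "RESTING"

def run_comparison(orders, best_ask):
    """Return the ids of orders where the system disagrees with the model."""
    return [
        o["id"] for o in orders
        if model_expected(o, best_ask) != system_under_test(o, best_ask)
    ]

orders = [{"id": 1, "limit": 99}, {"id": 2, "limit": 100}, {"id": 3, "limit": 101}]
mismatches = run_comparison(orders, best_ask=100)
print(mismatches)  # [2]
```

Only the boundary case is flagged, which is the kind of discrepancy an automated comparison is meant to surface.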

Danila Gorkavchenko

Danila Gorkavchenko, QA Analyst, Exactpro

Abstract:

The paper describes the authors' experience of implementing machine learning techniques to predict deviations in the duration of service workflows, long before the post-trade system reports them as not completed on time. The prediction is based on analyzing a large set of performance metrics collected every second from the modules of the system, and on using regression models to detect running workflows that are likely to hang. This article covers raw data preprocessing, dataset dimensionality reduction, the applied regression models and their performance. Open problems and the project roadmap are also described.

12:00 Break  
Session 4 Distributed and Decentralized Technologies  
Franck Cassez

Franck Cassez, ConsenSys

Abstract:

The Beacon Chain is a core component of the new Ethereum 2.0 blockchain. We have formally verified a large part of the Beacon Chain specifications and in this talk I will report on our verification experience and findings.

Maxim Bakaev

Maxim Bakaev, PhD, Associate Professor, Novosibirsk State Technical University

Abstract:

Human-Computer Interaction sees increased application of AI methods, particularly for testing and assessing the characteristics of graphic user interfaces (UIs). For instance, target users’ subjective perceptions of visual complexity, aesthetic impression, trust, etc. can be predicted to some extent based on UI appearance. Buttons, text blocks, images and other elements in today’s UIs on all platforms are usually aligned to grids – i.e. regular vertical and horizontal lines – in order to decrease visual clutter. However, the grids are not apparent in the UI representations available for analysis (HTML/CSS, etc.), unlike in design mockups, and have to be reverse-engineered for the benefit of further UI assessment. In our paper we propose an algorithm for automated construction of layout grids on top of visual representations (screenshots) of existing UIs and demonstrate its work with various configuration parameters. The algorithm was inspired by the informal Squint Test known from Usability Engineering practice and is based on the successive application of several computer vision techniques supported by the OpenCV library. The main stages are edge detection, image pixelization, grid overlaying, and cell coding. The algorithm’s configuration parameters can be tuned to match the UI perception of representative users, as demonstrated in the paper. The outcome of the algorithm is a coded representation of a graphical UI as a 2D matrix, which is a convenient medium for further processing. Automating the coding of UI layouts can make it possible to obtain the large datasets needed by up-to-date user behavior models that predict the quality of interaction with UIs.
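The last two stages, grid overlaying and cell coding, can be sketched in a few lines. This is not the authors' algorithm: it assumes that edge detection has already produced a binary edge map (for instance with OpenCV, omitted here), and the square grid, the binary 0/1 cell codes and the `threshold` parameter are illustrative assumptions.

```python
def code_cells(edge_map, cell, threshold=0.25):
    """Overlay a square grid and mark each cell 1 if enough of its pixels are edges."""
    rows, cols = len(edge_map), len(edge_map[0])
    coded = []
    for top in range(0, rows, cell):
        row_codes = []
        for left in range(0, cols, cell):
            block = [
                edge_map[r][c]
                for r in range(top, min(top + cell, rows))
                for c in range(left, min(left + cell, cols))
            ]
            row_codes.append(1 if sum(block) / len(block) >= threshold else 0)
        coded.append(row_codes)
    return coded

# Tiny hypothetical edge map: 1 marks an edge pixel.
edge_map = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
print(code_cells(edge_map, cell=2))  # [[1, 0], [0, 1]]
```

The resulting matrix is the kind of compact 2D representation that can be fed to downstream models.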

Anastasiia Kaida

Anastasiia Kaida, Teaching assistant, National Research Tomsk Polytechnic University, School of Computer Science & Robotics

Abstract:

Experiment management systems (EMS) have more than twenty-five years of history. From small desktop prototypes to large-scale distributed systems, they have become more and more complicated. A new chapter for EMS opened with the age of Big Data, which is surrounded by a special ecosystem to extract, analyze and store data. The big data ecosystem brings new elements that must be taken into account to extend EMS functionality to support all data lifecycle stages. One of the challenges is to highlight the key points of the huge variety of EMS that have evolved over time. Such systems do not usually follow a unified pattern because of the special needs of each project. This paper introduces a conceptual high-level architecture as an example of a unified pattern for building EMS for big data ecosystems. The architecture is not intended for use with the GRID-computing approach.

Ilya Samonenko

Ilya Samonenko, Associate Professor, HSE

Abstract:

We study a distributed process organized as a sequence of related modules. A token passes through different modules and leaves a digital trace in the log. The sequence is not deterministic, yet the number of possible paths is rather low. Our goal is to find an optimal distance metric for the modules that could be used to predict the digital trace of newly arriving tokens, so that modules at a lower distance produce more similar traces.

15:05 Closing Day 2  
Day 3
27 November
Papers Speaker/Author
Session 5 Image Recognition  
Alexey Khoroshilov

Alexey Khoroshilov, Lead Researcher, Ivannikov Institute for System Programming of the Russian Academy of Sciences

Abstract:

The talk will review the challenges that formal methods face when applied in industrial settings, as well as the patterns that are often used to overcome these challenges. Alexey will also share the experience of using these patterns within the projects of the Linux Verification Center of ISPRAS.

Chongyu Gu

Chongyu Gu, Ph.D. student, TSU

Maxim Gromov

Maxim Gromov, Associate Professor, TSU

Abstract:

The use of transformer-based architectures by computer vision researchers is on the rise. Recently, implementations of GANs that use transformer-based architectures, such as TransGAN, GANformer, and ViTGAN, have shown promise for visual generative modeling. We introduce TransCycleGAN, a novel, efficient GAN model, and explore its application to image-to-image translation. Unlike the architectures above, our generator takes source images as input, not just noise. We developed the model and carried out preliminary experiments on the horse2zebra dataset resized to 64×64. The experimental outcomes show the potential of our new architecture. An implementation of the model is available under the MIT license on GitHub.

Semen Tkachev

Semen Tkachev, Graduate student, TPU

Abstract:

The efficiency of the YOLOv4 convolutional neural network (CNN) in detecting objects moving in airspace is investigated. Video materials of two classes of flying objects (FO) were used as the initial data for training and testing the CNN: helicopter-type unmanned aerial vehicles and gliders. The video materials were obtained in the optical and infrared (IR) wavelength ranges, and datasets were formed from them as sets of images. Large-scale studies of the detection efficiency of the YOLOv4 CNN on such images have been conducted. It is shown that the accuracy of FO detection in optical images is higher than in images obtained in the IR wavelength range.

Dmitry Dubinin

The language of the presentation is Russian.

Dmitry Dubinin, TPU

Abstract:

The paper describes a mechanism for modeling an optical flow as a random vector field – closed areas on the image plane with certain brightness and dynamics of changes in the vector field. The optical flow is formed by a homogeneous, three-dimensional point renewal process. The characteristics of the vector field on the image plane of the resulting optical flow are interconnected by Palm’s formulas. The type of the constituent elements of the vector field (the alphabet, which determines the morphology of the field on the image plane) is chosen randomly. The proposed approach makes it possible to produce various types of interconnected digital image sequences with horizontal, vertical and diagonal elements; to create the prerequisites for efficient and flexible motion analysis on digital video sequences; and to use the probabilistic factor in researching image processing algorithms and comparing them on the basis of a detailed factor analysis.

Handout (EN)

12:00 Break  
Session 6 Complex Systems Modeling  
Feofilaktov Vladislav

Feofilaktov Vladislav, SPbPU

Abstract:

Modern software design practice implies the widespread use of ready-made components, usually designed as external libraries. The undoubted advantages of reusing third-party code can be offset by integration errors appearing in the developed software. Such errors arise mainly from the programmer's misunderstanding or incomplete understanding of the details of external libraries, such as their internal structure and the subtleties of their functioning. The documentation provided with the libraries is often very sparse and describes only the main intended scenarios for the interaction of the program and the library. In this paper, we propose an approach based on the use of formal library specifications, which allows detecting integration errors using static analysis methods. To do this, the external library is described using the LibSL specification language, and the resulting description is translated into the internal data structures of the KEX analyzer. The execution of incorrect library usage scenarios, such as an incorrect sequence of method calls or a violation of an API function contract, is marked in the program model with special built-in functions of the KEX analyzer. Later, when analyzing the program, KEX is able to detect integration errors, since incorrect library usage scenarios are diagnosed as calls to the marked functions. The proposed approach is implemented as SPIDER (SPecification-based Integration Defect Revealer), which is an extension of the KEX analyzer. It has shown its efficiency by detecting integration errors of different classes on a number of specially made projects, as well as on several projects taken from open repositories.
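The notion of an "incorrect sequence of method calls" can be illustrated with a small automaton-based check. This is neither LibSL syntax nor how KEX works internally (KEX analyzes programs statically, while this sketch only checks a given call trace); the states, method names and the `SPEC` table are hypothetical.

```python
SPEC = {  # hypothetical library automaton: state -> {method: next state}
    "Created": {"open": "Opened"},
    "Opened": {"read": "Opened", "close": "Closed"},
    "Closed": {},
}

def first_violation(calls, spec=SPEC, start="Created"):
    """Return the index of the first call the spec forbids, or None if all are allowed."""
    state = start
    for index, method in enumerate(calls):
        if method not in spec[state]:
            return index  # this call is not permitted in the current state
        state = spec[state][method]
    return None

print(first_violation(["open", "read", "close"]))  # None
print(first_violation(["open", "close", "read"]))  # 2
```

A static analyzer reasons about all call sequences a program could produce rather than a single observed trace, but the specification plays the same role.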

Luba Konnova

Luba Konnova, Senior Project Manager, Exactpro

Dmitry Fomin

Dmitry Fomin, Senior Software Engineer, Exactpro

Andrey Novikov

Andrey Novikov, Managing Partner, SynData.io

Egor Kolesnikov

Egor Kolesnikov, Machine Learning Consultant, SynData.io

Abstract:

This paper outlines an experiment in building an obfuscated version of a proprietary financial transactions dataset. As per industrial requirements, no data from the original dataset should find its way to third parties, so all the fields were generated artificially, including banks, customers (including geographic locations), and particular transactions. However, we aimed to keep as many distributions and correlations from the original dataset as possible, with adjustable levels of introduced noise in each of the fields: geography, bank-to-bank transaction flows, and distributions of volumes/numbers of transactions in various subsections of the dataset. The article could be of use to anyone who may want to produce a publishable dataset, e.g., for the alternative data market, where it’s essential to keep the structure and correlations of the proprietary non-disclosed original dataset.
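One simple way to picture "keeping a distribution with an adjustable level of noise" is to blend the empirical frequencies of a categorical field with a uniform distribution and sample from the result. This is an assumption made for illustration, not the authors' method; it handles a single field only, ignores correlations between fields, and leaves out the step of replacing original category values with artificial labels.

```python
import random
from collections import Counter

def noisy_distribution(values, noise):
    """Blend empirical category frequencies with a uniform distribution (noise in [0, 1])."""
    counts = Counter(values)
    total = sum(counts.values())
    uniform = 1.0 / len(counts)
    return {
        cat: (1.0 - noise) * (n / total) + noise * uniform
        for cat, n in counts.items()
    }

def synthesize(values, noise, size, seed=0):
    """Draw synthetic categories from the blended distribution."""
    dist = noisy_distribution(values, noise)
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    return rng.choices(list(dist), weights=list(dist.values()), k=size)

original = ["A", "A", "A", "B"]  # hypothetical region codes
print(noisy_distribution(original, noise=0.0))  # {'A': 0.75, 'B': 0.25}
print(noisy_distribution(original, noise=1.0))  # {'A': 0.5, 'B': 0.5}
print(synthesize(original, noise=0.2, size=10))
```

With `noise=0.0` the original frequencies are reproduced in expectation, and with `noise=1.0` they are fully hidden behind a uniform distribution.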

13:50 Closing Day 3