TMPA-2021: Agenda
Please note that the upper time is given in the GMT+3 time zone (currently observed in Moscow, Istanbul, Jerusalem, Djibouti, Baghdad, etc.), and the lower time is given in your local time zone.
Day 1 25 November |
Papers | Speaker/Author | |
---|---|---|---|
Session 1 | Neural networks | ||
9:20 | Conference Opening | Rostislav Yavorsky, TPU | |
9:30 | Formal Methods: Theory and Practice of Linux Verification Center (Part 1) | Alexey Khoroshilov, ISP RAS | keynote speaker |
Alexey Khoroshilov, Lead Researcher, Ivannikov Institute for System Programming of the Russian Academy of Sciences Abstract: The talk will review the challenges that formal methods meet being applied in industrial settings, as well as the patterns that are often used to overcome these challenges. Alexey will also share the experience of using patterns within the projects of Linux Verification Center of ISPRAS. |
|||
10:20 | Investigation of the capabilities of artificial neural networks in the problem of classifying objects with dynamic features | Nikita Laptev, Vladislav Laptev, Gerget Olga, Dmitrii Kolpashchikov and Andrey Kravchenko | |
Dmitrii Kolpashchikov, Engineer, Tomsk Polytechnic University Abstract: Image classification is a classic machine learning (ML) problem. Neural networks are widely used for object classification. Despite the existence of a large number of image classification algorithms, very little attention is paid to video data classification. When convolutional neural networks are used to classify frames of a video sequence, image features must be combined to obtain a prediction. However, with this approach, the features of object dynamics are ignored, since the images are processed sequentially. Therefore, the analysis of objects with dynamically changing characteristics remains a relevant issue. To address it, the authors propose using a neural network with long short-term memory (LSTM). In contrast to classical convolutional neural networks (CNN), the proposed network uses information about the sequence of images, thereby providing a higher classification accuracy for detected objects with dynamic characteristics. In the study, the authors analyze the classification accuracy of smoke cloud detection in a forest using various machine learning methods. The authors present models for classifying a single frame and a sequence of frames of a video sequence. The results produced by the machine learning models are presented, as well as a comparative analysis of single-frame and frame-sequence classification. The accuracy of video sequence classification by the recurrent neural network model with an LSTM layer was 85.7%. |
|||
10:45 | Open-Source Tools for Neural Network Inference on FPGAs | Mikhail Lebedev and Pavel Belecky | |
Mikhail Lebedev, Plekhanov Russian University of Economics Abstract: Artificial neural networks play a great role in modern life. Neural networks are executed on different hardware platforms: from CPUs and GPUs to FPGAs and ASICs. Many open-source tools help to optimize models and run inference on these platforms, or even synthesize specialized hardware. This article contains a survey of a range of open-source tools for neural network optimization, acceleration and hardware synthesis. Some of the tools have been evaluated using three simple neural network examples. CPU, GPU and FPGA devices have been used for evaluation. Results show that some of the chosen tools can successfully process neural network models and optimize them for CPU and GPU execution, whereas FPGA execution results are controversial. |
|||
11:10 | Bayesian Optimization with Time-Decaying Jitter for Hyperparameter Tuning of Neural Networks | Konstantin Maslov | |
Konstantin Maslov, PhD Student, Tomsk Polytechnic University Abstract: This paper introduces a modification of the ordinary Bayesian optimization algorithm for hyperparameter tuning of neural networks. The proposed algorithm uses a time-decaying parameter ξ (jitter) to dynamically balance exploration and exploitation. This algorithm is compared with the ordinary Bayesian optimization algorithm with various constant values of ξ, using diverse artificial landscapes. In this comparison, for some artificial landscapes and numbers of dimensions of the search domain, the proposed algorithm shows better performance. For some others, the ordinary algorithm outperforms the proposed one, but in most cases there is no statistically significant difference between the two algorithms. Both algorithms are then used to tune hyperparameters of a neural network for semantic image segmentation. The corresponding analysis has shown that both algorithms give comparable performance. |
|||
11:35 | Analysis of Hardware-Implemented U-Net–like Convolutional Neural Networks | Ivan Zoev, Nikolay Markov, Konstantin Maslov and Evgeniy Mytsko | |
Evgeniy Mytsko, Associate Professor, Tomsk Polytechnic University Abstract: Two convolutional neural networks (CNNs), U-Net and U-Net with dilated convolutions, were implemented. In order to train and test the CNNs, we utilised unmanned aerial vehicle images containing Abies sibirica trees damaged by Polygraphus proximus. The images contain five classes: four classes of trees, depending on their condition, and the background. The weights of the CNNs, obtained as a result of the training, were then used to implement the CNNs on a field-programmable gate array-based system-on-a-chip platform (Xilinx Zynq 7000). The paper also presents a comparison of the hardware-implemented CNNs in terms of classification quality and time efficiency. |
|||
12:00 | Break | ||
Session 2 | Process Mining | ||
13:00 | Meta-heuristic Techniques and Their Applications | Mohamed Elsayed Ahmed Mohamed, TPU | keynote speaker |
Mohamed Elsayed Ahmed Mohamed (Abd Elaziz), Professor at School of Computer Science and Robotics, TPU Abstract: The field of metaheuristic (MH) techniques has flourished over the years owing to their strong influence on the performance of artificial intelligence techniques applied in various real-world applications. MH techniques fall into four categories: 1) swarm-based, 2) evolutionary-based, 3) human-based, and 4) physics-based. By simulating these behaviors, MH techniques have proven their ability to solve single- and multi-objective optimization problems, for example in cloud computing, healthcare, and engineering. |
|||
13:50 | Fair Mutual Exclusion for N Processes | Yousra Hafidi, Jeroen J.A. Keiren and Jan Friso Groote, Eindhoven University of Technology | |
Yousra Hafidi, Eindhoven University of Technology Abstract: Peterson's mutual exclusion algorithm for two processes has been generalized to N processes in various ways. As far as we know, no such generalization is starvation free without making any fairness assumptions. In this paper, we study the generalization of Peterson's algorithm to N processes using a tournament tree. Using the mCRL2 language and toolset we prove that it is not starvation free unless weak fairness assumptions are incorporated. Inspired by the counterexample for starvation freedom, we propose a fair N-process generalization of Peterson's algorithm. We use model checking to show that our new algorithm is correct for small N. For arbitrary N, model checking is infeasible due to the state space explosion problem, and instead, we present a general proof that, for N ≥ 4, when a process requests access to the critical section, other processes can enter first at most (N - 1)(N - 2) times. |
|||
14:15 | Process Mining Algorithm for Online Intrusion Detection System | Yinzheng Zhong, Yannis Goulermas and Alexei Lisitsa | |
Yinzheng Zhong, University of Liverpool Abstract: In this paper, we consider the applications of process mining in intrusion detection. We propose a novel process mining inspired algorithm to be used to preprocess data in intrusion detection systems (IDS). The algorithm is designed to process the network packet data and it works well in online mode for online intrusion detection. To test our algorithm, we used the CSE-CIC-IDS2018 dataset which contains several common attacks. The packet data was preprocessed with this algorithm and then fed into the detectors. We report on the experiments using the algorithm with different machine learning (ML) models as classifiers to verify that our algorithm works as expected; we tested the performance on anomaly detection methods as well and reported on the existing preprocessing tool CICFlowMeter for the comparison of performance. |
|||
14:40 | Closing Day 1 | ||
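The time-decaying jitter described in the 11:10 talk on Bayesian optimization can be illustrated with a small sketch. It is not the author's implementation: it assumes the expected improvement acquisition function for maximisation and an exponential decay schedule, neither of which is stated in the abstract, and the names `decayed_jitter` and `expected_improvement` are hypothetical.

```python
import math


def normal_pdf(z: float) -> float:
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def decayed_jitter(xi0: float, decay: float, step: int) -> float:
    """Assumed schedule (not from the abstract): xi_t = xi0 * decay**step."""
    return xi0 * decay ** step


def expected_improvement(mu: float, sigma: float, best: float, xi: float) -> float:
    """Expected improvement for maximisation with jitter xi.

    mu and sigma are the surrogate model's predicted mean and standard
    deviation at a candidate point; best is the best value observed so far.
    """
    gain = mu - best - xi
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(gain, 0.0)
    z = gain / sigma
    return gain * normal_cdf(z) + sigma * normal_pdf(z)


if __name__ == "__main__":
    best_so_far = 1.0
    for step in (0, 10, 50):
        xi = decayed_jitter(0.5, 0.9, step)
        # A near-certain small gain versus an uncertain, lower-mean candidate.
        certain = expected_improvement(1.05, 0.01, best_so_far, xi)
        uncertain = expected_improvement(0.4, 0.5, best_so_far, xi)
        print(step, round(xi, 4), "explore" if uncertain > certain else "exploit")
```

With these illustrative numbers, a large early ξ favours the uncertain candidate (exploration), while the decayed ξ at later steps favours the near-certain small gain (exploitation).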
Day 2 26 November |
Papers | Speaker/Author | |
Session 3 | Software Testing | ||
9:30 | Automation in Software Testing. Humans and Complex Models. | Iosif Itkin, Exactpro | keynote speaker |
Iosif Itkin, CEO and co-founder, Exactpro Abstract: As part of the TMPA-2021 conference, Exactpro’s CEO and co-founder Iosif Itkin will give a presentation titled "Automation in Software Testing. Humans and Complex Models." In this presentation, Iosif will give a brief overview of research on the concept of model-based testing and the principal challenges of applying it when testing complex distributed systems. He will also outline the broader context of interaction between humans and complex computer models. |
|||
10:20 | Searching for Deviations in Trading Systems: Combining Control-Flow and Data Perspectives | Julio Cesar Carrasquel and Irina Lomazova, HSE | |
Julio Cesar Carrasquel, HSE Abstract: Trading systems are software platforms that support the exchange of securities (e.g., company shares) between participants. In this paper, we present a method to search for deviations in trading systems by checking conformance between colored Petri nets and event logs. Colored Petri nets (CPNs) are an extension of Petri nets, a formalism for modeling distributed systems; they make it possible to describe an expected causal ordering between system activities and how data attributes of domain-related objects (e.g., orders to trade) must be transformed. Event logs consist of traces corresponding to runs of a real system. By comparing CPNs and event logs, different types of deviations can be detected. Using this method, we report the validation of a real-life trading system. |
|||
10:45 | Data Stream Processing in Reconciliation Testing: Industrial Experience | Iosif Itkin, Nikolay Dorofeev, Stanislav Glushkov, Alexey Yermolayev and Elena Treshcheva, Exactpro | |
Alexey Yermolayev, QA Project Manager, Exactpro; Nikolay Dorofeev, Senior DocOps Engineer Abstract: This research focuses on tools and methods of reconciliation testing, an approach to software testing that relies on the data reconciliation concept. The importance of this test approach is steadily increasing across different knowledge domains, driven by growing data volumes and the overall complexity of present-day software systems. The paper describes the software implementation created as part of the authors’ industrial experience with data streaming analysis for the task of reconciliation testing of complex financial technology systems. The described solution is a Python-based component of an open-source test automation framework built as a Kubernetes-based microservices platform. The paper outlines the advantages and disadvantages of the approach and compares it to existing state-of-the-art solutions for data streaming analysis and reconciliation checks. |
|||
11:10 | Model-based Testing Approach for Financial Technology Platforms: An Industrial Implementation | Luba Konnova, Ivan Scherbinin, Vyacheslav Okhlopkov, Levan Gharibashvili, Mariam Mtsariashvili and Tiniko Babalashvili, Exactpro | |
Ivan Scherbinin, Project Manager, Exactpro; Tiniko Babalashvili, Junior Software Tester, Exactpro; Luba Konnova, Senior Project Manager, Exactpro; Levan Gharibashvili, Middle Python Developer, Exactpro Abstract: This paper looks at the industrial experience of using an automated model-based approach for the testing of trading systems. The approach, used by Exactpro, is described using two existing taxonomies. Then, the main future applications of the models in the test automation paradigm are outlined. In our approach, a model is a kind of virtual replica of the system under test that generates expected results from the input (built based on the specifications). Models are created in Python, which provides flexibility in describing the behavior of complex financial systems. Models are integrated into the Exactpro th2 test automation framework, and the results from the system under test and the model are compared automatically. |
|||
11:35 | Early Detection of Tasks With Uncommonly Long Run Duration in Post-Trade Systems | Maxim Nikiforov, Danila Gorkavchenko, Murad Mamedov, Andrey Novikov and Nikita Pushchin, Exactpro | |
Danila Gorkavchenko, QA Analyst, Exactpro Abstract: The paper describes the authors' experience of implementing machine learning techniques to predict deviations in the duration of service workflows, long before the post-trade system reports them as not completed on time. The prediction is based on analyzing a large set of performance metrics collected every second from modules of the system, and using regression models to detect running workflows that are likely to hang. This article covers raw data preprocessing, dataset dimensionality reduction, the applied regression models and their performance. Problems to be resolved and the project roadmap are also described. |
|||
12:00 | Break | ||
Session 4 | Distributed and Decentralized Technologies | ||
13:00 | Formal Verification of the Eth2.0 Beacon Chain | Franck Cassez, ConsenSys | keynote speaker |
Franck Cassez, ConsenSys Abstract: The Beacon Chain is a core component of the new Ethereum 2.0 blockchain. We have formally verified a large part of the Beacon Chain specifications and in this talk I will report on our verification experience and findings. |
|||
13:50 | Algorithm for Mapping Layout Grids in User Interfaces: Automating the “Squint Test” | Maxim Bakaev and Maxim Shirokov | |
Maxim Bakaev, PhD, Associate Professor, Novosibirsk State Technical University Abstract: Human-Computer Interaction sees increased application of AI methods, particularly for testing and assessing the characteristics of graphic user interfaces (UIs). For instance, target users’ subjective perceptions of visual complexity, aesthetic impression, trust, etc. can be predicted to some extent based on UI appearance. Buttons, text blocks, images and other elements in today’s UIs on all platforms are usually aligned to grids, i.e. regular vertical and horizontal lines, in order to decrease visual clutter. However, the grids are not apparent in the UI representations available for analysis (HTML/CSS, etc.), unlike in design mockups, and have to be reverse-engineered for the benefit of further UI assessment. In our paper we propose an algorithm for automated construction of layout grids on top of visual representations (screenshots) of existing UIs and demonstrate its work with various configuration parameters. The algorithm was inspired by the informal Squint Test known from Usability Engineering practice and is based on the successive application of several computer vision techniques supported by the OpenCV library. The main stages are edge detection, image pixelization, grid overlaying, and cell coding. The algorithm’s configuration parameters can be tuned to match the UI perception of representative users, as demonstrated in the paper. The outcome of the algorithm is a coded representation of a graphical UI as a 2D matrix, which is a convenient medium for further processing. Automating the coding of UI layouts can make it possible to obtain the large datasets needed by up-to-date user behavior models that predict the quality of interaction with UIs. |
|||
14:15 | Modern experiment management systems architecture for scientific big data | Anastasiia Kaida and Aleksei Savelev | |
Anastasiia Kaida, Teaching assistant, National Research Tomsk Polytechnic University, School of Computer Science & Robotics Abstract: Experiment management systems (EMS) have more than twenty-five years of history. From small desktop prototypes to large-scale distributed systems, they have become more and more complicated. A new chapter for EMS opened with the age of big data, which is surrounded by a dedicated ecosystem for extracting, analyzing and storing data. The big data ecosystem introduces new elements that must be taken into account to expand the functionality of EMS so that it supports all data lifecycle stages. One of the challenges is to highlight the key points of the huge variety of EMS that have evolved over time. Such systems do not usually follow a unified pattern because of the special needs of each project. This paper introduces a conceptual high-level architecture as an example of a unified pattern for building EMS for big data ecosystems. The architecture is not intended for use with the grid-computing approach. |
|||
14:40 | An approach to modules similarity definition based on the system log analysis | Ilya Samonenko, Tamara Voznesenskaya and Rostislav Yavorskiy, TPU | |
Ilya Samonenko, Associate Professor, HSE Abstract: We study a distributed process organized as a sequence of somehow related modules. A token passes through different modules and leaves a digital trace in the log. The sequence is not deterministic, yet the number of possible paths is rather low. Our goal is to find an optimal distance metric for the modules that could be used to predict the digital trace of newly arriving tokens, so that modules at a smaller distance from each other produce more similar traces. |
|||
15:05 | Closing Day 2 | ||
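The last two stages named in the 13:50 talk on layout grids, grid overlaying and cell coding, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' algorithm: it takes an already computed binary edge map (the paper uses OpenCV for the earlier stages), and the cell size, the threshold and the function name `code_cells` are hypothetical.

```python
from typing import List


def code_cells(edges: List[List[int]], cell: int, threshold: float) -> List[List[int]]:
    """Overlay a grid of square cells on a binary edge map and code each cell.

    A cell is coded 1 when the share of edge pixels inside it reaches
    `threshold`, otherwise 0. Cells at the right and bottom borders may be
    smaller than `cell` x `cell` pixels.
    """
    height = len(edges)
    width = len(edges[0]) if height else 0
    coded: List[List[int]] = []
    for top in range(0, height, cell):
        row: List[int] = []
        for left in range(0, width, cell):
            block = [
                edges[y][x]
                for y in range(top, min(top + cell, height))
                for x in range(left, min(left + cell, width))
            ]
            row.append(1 if sum(block) / len(block) >= threshold else 0)
        coded.append(row)
    return coded


if __name__ == "__main__":
    # Toy 4x4 edge map: 1 marks an edge pixel, 0 marks background.
    edge_map = [
        [1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 0, 0],
    ]
    for coded_row in code_cells(edge_map, cell=2, threshold=0.5):
        print(coded_row)
```

Lowering the threshold makes sparse cells count as occupied, which is one way a configuration parameter of this kind could be tuned against user perception.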
Day 3 27 November |
Papers | Speaker/Author | |
Session 5 | Image Recognition | ||
9:30 | Formal Methods: Theory and Practice of Linux Verification Center (Part 2 - Testing of Operating Systems) | Alexey Khoroshilov, ISP RAS | keynote speaker |
Alexey Khoroshilov, Lead Researcher, Ivannikov Institute for System Programming of the Russian Academy of Sciences Abstract: The talk will review the challenges that formal methods meet being applied in industrial settings, as well as the patterns that are often used to overcome these challenges. Alexey will also share the experience of using patterns within the projects of Linux Verification Center of ISPRAS. |
|||
10:20 | Unpaired Image-to-Image Translation using Transformer-based CycleGAN | Chongyu Gu and Maxim Gromov | |
Chongyu Gu, Ph.D. student, TSU; Maxim Gromov, Associate Professor, TSU Abstract: The use of transformer-based architectures by computer vision researchers is on the rise. Recently, implementations of GANs that use transformer-based architectures, such as TransGAN, GANformer, and ViTGAN, have shown promise for visual generative modeling. We introduce TransCycleGAN, a novel, efficient GAN model, and explore its application to image-to-image translation. In contrast to the architectures above, our generator takes source images as input, not just noise. We developed it and carried out preliminary experiments on the horse2zebra dataset resized to 64×64. The experimental outcomes show the potential of our new architecture. An implementation of the model is available under the MIT license on GitHub. |
|||
10:45 | Link graph and data-driven graphs as complex networks: comparative study | Vasilii Gromov, HSE | |
Vasilii A. Gromov, Prof., HSE University Abstract: The link and data-driven graphs corresponding to the same dataset are studied as complex networks. It appears that a link graph features the majority of complex networks characteristics, whereas the corresponding data-driven graphs feature only part of these characteristics. Some characteristics appear to be more stable, when one moves from a link graph to data-driven graphs and over data-driven graphs, than others. In particular, one observes giant components and power-law community size distributions for most link and data-driven graphs. Also, data-driven graphs usually retain small world property and relatively large values of clustering coefficients, provided the same holds true for the respective link graph. Meanwhile, only the ε-ball neighbourhood graph and the Gabriel graph exhibit power-law degree distributions as their link counterparts do. The assortativity coefficient is essentially corrupted when one moves from a link graph to data graphs. Sometimes, assortativity alters to disassortativity. Among all data-driven graphs considered, the Gabriel graph seems to retain most properties of complex networks. |
|||
11:10 | Detection of flying objects using the YOLOv4 convolutional neural network | Semen Tkachev and Nikolay Markov | |
Semen Tkachev, Graduate student, TPU Abstract: The efficiency of the YOLOv4 convolutional neural network (CNN) in detection of objects moving in airspace is investigated. Video materials of two classes of flying objects (FO) were used as the initial data for training and testing of the CNN: helicopter-type unmanned aerial vehicles and gliders. Video materials were obtained in the optical and infrared (IR) wavelength ranges. Datasets are formed from them in the form of a set of images. Large-scale studies of the detection efficiency of the YOLOv4 CNN on such images have been conducted. It is shown that the accuracy of detection of FO in optical images is higher than in images obtained in the IR wavelength range. |
|||
11:35 | Optic Flow approximated by a homogeneous, three-dimensional Point Renewal Process (RU) | Dmitry Dubinin, Alexander Kochegurov, Elena Kochegurova and Viktor Geringer, TPU | |
The language of the presentation is Russian. Dmitry Dubinin, TPU Abstract: The paper describes a mechanism for modeling an optical flow as a random vector field – closed areas on the image plane with certain brightness and dynamics of changes in the vector field. The optical flow is formed by a homogeneous, three-dimensional point renewal process. The characteristics of the vector field on the image plane of the resulting optical flow are interconnected by Palm’s formulas. The type of the constituent elements of the vector field (the alphabet, which determines the morphology of the field on the image plane) is chosen randomly. The proposed approach will produce various types of interconnected digital sequences of images with horizontal, vertical and diagonal elements; create the prerequisites for efficient and flexible motion analysis on digital video sequences; and make it possible to use the probabilistic factor in researching image processing algorithms and comparing the algorithms based on a detailed factor analysis. |
|||
12:00 | Break | ||
Session 6 | Complex Systems Modeling | ||
13:00 | SPIDER: Specification-based Integration Defect Revealer | Vladislav Feofilaktov and Vladimir Itsykson, SPBSTU | |
Vladislav Feofilaktov, SPbPU Abstract: Modern software design practice implies widespread use of ready-made components, usually designed as external libraries. The undoubted advantages of reusing third-party code can be offset by integration errors appearing in the developed software. Such errors arise mainly from the programmer's misunderstanding or incomplete understanding of the details of external libraries, such as their internal structure and the subtleties of their functioning. The documentation provided with the libraries is often very sparse and describes only the main intended scenarios for the interaction of the program and the library. In this paper, we propose an approach based on the use of formal library specifications, which allows detecting integration errors using static analysis methods. To do this, the external library is described using the LibSL specification language; the resulting description is then translated into the internal data structures of the KEX analyzer. The execution of incorrect scenarios of library usage, such as an incorrect sequence of method calls or a violation of the API function contract, is marked in the program model with special built-in functions of the KEX analyzer. Later, when analyzing the program, KEX is able to detect integration errors, since incorrect library usage scenarios are diagnosed as calls to marked functions. The proposed approach is implemented as SPIDER (SPecification-based Integration Defect Revealer), which is an extension of the KEX analyzer and has shown its efficiency by detecting integration errors of different classes on a number of specially made projects, as well as on several projects taken from open repositories. |
|||
13:25 | An approach to creating a synthetic financial transactions dataset based on an NDA-protected dataset | Luba Konnova, Yuri Silenok, Dmitry Fomin, Andrey Novikov, Egor Kolesnikov, Ksenia Vorontsova and Daria Degtyarenko, Exactpro | |
Luba Konnova, Senior Project Manager, Exactpro; Dmitry Fomin, Senior Software Engineer, Exactpro; Andrey Novikov, Managing Partner, SynData.io; Egor Kolesnikov, Machine Learning Consultant, SynData.io Abstract: This paper outlines an experiment in building an obfuscated version of a proprietary financial transactions dataset. As per industrial requirements, no data from the original dataset should find its way to third parties, so all the fields were generated artificially, including banks, customers (including geographic locations), and particular transactions. However, we aimed to keep as many distributions and correlations from the original dataset as possible, with adjustable levels of introduced noise in each of the fields: geography, bank-to-bank transaction flows, and distributions of volumes/numbers of transactions in various subsections of the dataset. The article could be of use to anyone who may want to produce a publishable dataset, e.g., for the alternative data market, where it’s essential to keep the structure and correlations of the proprietary non-disclosed original dataset. |
|||
13:50 | Closing Day 3 |
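One of the data-driven graphs mentioned in the 10:45 talk, the ε-ball neighbourhood graph, can be built from a point set as in the sketch below, together with the degree histogram one would inspect for a power-law shape. It assumes Euclidean distance and a non-strict threshold (distance ≤ ε); the abstract does not specify either, and the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def epsilon_ball_graph(points: Sequence[Sequence[float]], eps: float) -> Dict[int, List[int]]:
    """Connect every pair of points whose Euclidean distance is at most eps.

    Returns an adjacency list keyed by the index of each point.
    """
    adjacency: Dict[int, List[int]] = {i: [] for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= eps:
                adjacency[i].append(j)
                adjacency[j].append(i)
    return adjacency


def degree_histogram(adjacency: Dict[int, List[int]]) -> Dict[int, int]:
    """Count how many vertices have each degree."""
    histogram: Dict[int, int] = {}
    for neighbours in adjacency.values():
        degree = len(neighbours)
        histogram[degree] = histogram.get(degree, 0) + 1
    return histogram


if __name__ == "__main__":
    # Toy data: three nearby points and one outlier.
    sample_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
    graph = epsilon_ball_graph(sample_points, eps=1.0)
    print(graph)
    print(degree_histogram(graph))
```

The pairwise loop is quadratic in the number of points, which is acceptable for a toy example; larger datasets would normally use a spatial index.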