Big Data

"Big Data" is a broad term for data sets so large or complex that traditional data processing applications cannot handle them adequately. Such data sets are typically processed with distributed computing technologies such as Hadoop or Spark.

The value of data lies in its potential for repeated reuse: the same data set can serve purposes beyond the one it was originally collected for. Collecting data is important but not sufficient, since much of its value comes from how it is used rather than from storing it. Mishandling big data, whether through breaches of confidentiality, poor forecasting or misinterpretation, can have significant adverse consequences for a company.

If you face these difficulties and cannot keep up with growing business requirements for accessing and analyzing additional data, STG is ready to help.

STG will help you take advantage of the best Big Data technologies to use, expand and apply your data. Our expertise with Hadoop platforms based on massively parallel processing (MPP), Spark, cloud and local storage systems and other emerging technologies will help your company gain the benefits of Big Data.

STG provides the following services in this area:

  • Information systems engineering;
  • Architecture development and implementation of a computing cluster for processing, training, recognition and analysis;
  • Installation and configuration of Cloudera, Hortonworks and other Hadoop distributions;
  • Data processing using Hive, Sqoop, Pig, and Spark;
  • Java, Python and Scala development;
  • Integration with Amazon Web Services (AWS) and Microsoft Azure environments;
  • NoSQL processing with MongoDB, HBase, etc.;
  • Relational analysis with Redshift, Vertica, Teradata and other systems;
  • Analysis using Spark, R, Python, etc.;
  • Using NVIDIA GPUs to accelerate processing, training, recognition and analysis;
  • Visualization and analysis with Datameer, Alpine Data Labs, Tableau, etc.

Our professionals have deep expertise in statistical data analysis and the development of mathematical models (neural networks, Bayesian networks, clustering, and regression, factor, variance and correlation analysis, etc.) and a solid background with statistical tools such as SPSS, R, MATLAB and SAS Data Miner. They solve tasks involving the analytical processing of data arrays (clustering, classification, forecasting, pattern detection) in a range of applied subject areas, using the following CAT methods: linear regression, logistic regression, artificial neural networks (including spiking neural networks), support vector machines, and the nearest-neighbor method.
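
As a brief illustration of one of the methods named above, the nearest-neighbor method can be sketched in a few lines of Python. This is a minimal sketch, not a description of STG's tooling: the function name knn_predict, the parameter k and the toy data are assumptions made up for this example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    Hypothetical helper written for illustration only.
    """
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data (made up): two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["low", "low", "low", "high", "high", "high"]

prediction_near_low = knn_predict(points, labels, (1.1, 1.0))
prediction_near_high = knn_predict(points, labels, (5.1, 5.0))
```

On real data, the choice of k and of the distance measure would normally be validated against held-out examples.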

STG develops and deploys information security systems designed to provide comprehensive support for monitoring, analyzing and defending against threats and vulnerabilities. These systems also reduce the risk of information loss or unauthorized copying, thereby enhancing the efficiency of your enterprise management. STG provides "turnkey" solutions and implements projects of any complexity for a wide range of Big Data objectives.

STG has significant experience in statistical modeling, a method with a wide range of applications. These include modeling random processes, estimating the dispersion parameters of random variables, calculating integrals, solving systems of equations, solving queuing problems, solving problems in game theory, optimizing functions by random search, etc. Statistical modeling is applied in the following areas:

  • Business processes;
  • Business simulation;
  • Military operations;
  • Population dynamics;
  • Road traffic;
  • IT infrastructure;
  • Mathematical modeling of historical processes;
  • Logistics;
  • Pedestrian dynamics;
  • Production;
  • Market and competition;
  • Service centers;
  • Supply chain;
  • Street traffic;
  • Project management;
  • Healthcare economics;
  • Ecosystem;
  • Information security;
  • Relay protection.
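
To make one of the statistical modeling tasks mentioned earlier more concrete, calculating an integral by random sampling (the Monte Carlo approach) can be sketched as follows. This is an illustrative sketch only: the function name monte_carlo_integral, the sample count and the fixed seed are assumptions chosen for the example.

```python
import math
import random


def monte_carlo_integral(f, a, b, n_samples=100_000, seed=0):
    """Estimate the integral of f over [a, b] by averaging f at random points.

    Hypothetical helper written for illustration only.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    total = 0.0
    for _ in range(n_samples):
        # Draw a point uniformly from [a, b] and accumulate f at that point.
        total += f(rng.uniform(a, b))
    # Mean value of f on the interval, scaled by the interval length.
    return (b - a) * total / n_samples


# The exact integral of sin(x) over [0, pi] is 2, so the estimate should be close to 2.
estimate = monte_carlo_integral(math.sin, 0.0, math.pi)
```

The error of such an estimate shrinks roughly in proportion to one over the square root of the number of samples, which is why the method is usually reserved for problems where direct numerical integration is impractical.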