Lecture
1 Patterns found by applying Data Mining technology
must have the following properties:
a) be obvious;
b) be non-obvious;
c) be practically useful;
d) be objective;
e) the more patterns are found, the better.
2 Select the statement that best characterizes Data Mining:
a) it is suitable for understanding retrospective data;
b) it relies on retrospective data to answer
questions about the future;
c) it is suitable for summarizing historical data.
3 Data is:
a) facts and graphics;
b) text;
c) pictures, sounds, analog or digital video segments;
d) all of the above.
4 For which scale are only the operations "equal", "not equal",
"greater than", and "less than" applicable:
a) nominal scale;
b) ordinal scale;
c) interval scale?
5 Into which two groups are data mining methods divided according to
how they work with the original training data:
a) direct use of the data, or data retention;
b) identification and use of formalized patterns;
c) statistical methods;
d) cybernetic methods?
6 Which of the following stages can be considered an additional stage,
or a part of one of the main stages, of data mining:
a) identification of patterns (free search);
b) using the identified patterns to predict
unknown values (predictive modeling);
c) validation.
7 Fuzzy logic and decision trees:
a) belong to the statistical data mining methods;
b) belong to the cybernetic data mining methods;
c) are not data mining methods.
8 Is the following statement correct: "an association is a special case of
a sequence with a time lag equal to zero":
a) the statement is correct;
b) a sequence is a special case of an association;
c) neither the sequence nor the association is a special case of the
other?
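To make the term "association" in question 8 more concrete, here is a minimal sketch of the two measures usually computed for an association rule, support and confidence, over items that occur together in the same transaction. The basket data and the function names are hypothetical and serve only as an illustration.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]

def support(transactions: List[Transaction], itemset: FrozenSet[str]) -> float:
    """Share of transactions that contain every item of the itemset."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(transactions: List[Transaction],
               antecedent: FrozenSet[str],
               consequent: FrozenSet[str]) -> float:
    """Support of (antecedent and consequent) divided by support of antecedent."""
    base = support(transactions, antecedent)
    if base == 0.0:
        return 0.0
    return support(transactions, antecedent | consequent) / base

# Hypothetical baskets: all items in one basket were bought at the same time.
baskets = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

rule_support = support(baskets, frozenset({"bread", "milk"}))
rule_confidence = confidence(baskets, frozenset({"bread"}), frozenset({"milk"}))
```

Here the rule "bread, therefore milk" is evaluated within single transactions; a sequence analysis would additionally look at the order and timing of purchases across transactions.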
9 The basis of the so-called information pyramid is the category:
a) data;
b) knowledge;
c) information.
10 Data Mining tasks, depending on the models used,
are divided into:
a) supervised learning;
b) unsupervised learning;
c) descriptive;
d) predictive.
11 Identify the two main areas of Web Mining:
a) Web Content Mining;
b) Web Usage Mining;
c) Web Text Mining.
12 Which of the following areas aims to identify
patterns in the actions of a website user or a group of users:
a) Web Content Mining;
b) Web Usage Mining;
c) Web Text Mining?
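As an illustration of what "patterns in the actions of a website user or a group of users" can look like in practice, the sketch below counts page-to-page transitions in a clickstream. The session data, the page paths, and the function name are hypothetical; real logs would first need to be parsed and split into sessions.

```python
from collections import Counter
from typing import Dict, List

def transition_counts(sessions: Dict[str, List[str]]) -> Counter:
    """Count how often one page is followed directly by another."""
    counts: Counter = Counter()
    for pages in sessions.values():
        # Pair each page with the page visited immediately after it.
        counts.update(zip(pages, pages[1:]))
    return counts

# Hypothetical sessions: user id -> ordered list of visited pages.
sessions = {
    "user_a": ["/home", "/catalog", "/cart"],
    "user_b": ["/home", "/catalog", "/item"],
    "user_c": ["/home", "/cart"],
}

counts = transition_counts(sessions)
most_frequent = counts.most_common(1)
```

The most frequent transitions are a simple example of a pattern shared by a group of users.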
13 The final node of a decision tree is also called:
a) check node;
b) decision node;
c) leaf.
14 Bayesian networks have the following advantages:
a) they avoid the problem of overfitting;
b) they determine the dependencies between all variables;
c) only the individual values of the input variables affect the
classification result.
15 The support vector method is used to solve the following tasks:
a) binary classification;
b) not only binary classification;
c) forecasting.
16 Traditional visualization techniques may find the following
applications:
a) provide the user with information in a visual form;
b) compactly describe the patterns inherent in the original
data set;
c) reduce the dimensionality or compress information;
d) simplify calculations in the model;
e) restore gaps in the data set.
17 Finding noise and outliers in the data:
a) is possible with the help of visualization tools;
b) is impossible using visualization tools;
c) is not a function of visualization.
18 What characterizes one of the main trends in the field of
visualization:
a) an increase in the size of the data structures represented
through visualization;
b) increasing complexity of the data structures represented through
visualization;
c) a reduction in the size of the data structures represented
through visualization?
19 A quality data cleaning program should:
a) correct incorrect data;
b) create a small report on suspicious records;
c) require minimal installation, maintenance and
manual checks;
d) correct absolutely all suspicious data.
20 A quality data cleansing program should have these characteristics:
a) correct incorrect data;
b) create a small report on suspicious records;
c) require minimal installation, maintenance and
manual checks;
d) may partially affect the correct data.
21 For data cleansing tools, the enhancement function means:
a) adding to the data additional facts about the records that the
records do not contain;
b) removing duplicate data;
c) removing noise and outliers from the data.
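Two of the operations named in the options above, removing duplicate records and finding outliers, can be sketched in a few lines. This is only an illustration: the record layout, the function names, and the choice of the 1.5 x IQR rule for flagging outliers are assumptions, not something prescribed by the source.

```python
import statistics
from typing import List, Tuple

Record = Tuple[str, float]  # hypothetical layout: (customer id, order amount)

def drop_duplicates(records: List[Record]) -> List[Record]:
    """Remove exact duplicate records, keeping the first occurrence."""
    return list(dict.fromkeys(records))

def flag_outliers(values: List[float], k: float = 1.5) -> List[float]:
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical data: one repeated record and one suspiciously large amount.
records = [("c1", 10.0), ("c2", 12.0), ("c1", 10.0), ("c3", 11.0),
           ("c4", 13.0), ("c5", 12.0), ("c6", 11.0), ("c7", 10.0),
           ("c8", 95.0)]

unique = drop_duplicates(records)
amounts = [amount for _, amount in unique]
suspicious = flag_outliers(amounts)
```

The sketch only flags suspicious values rather than deleting them, so that a person can review them before any correction is made.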
22 At what stage does the work of the subject specialist and the
data mining specialist intersect:
a) analysis of business processes;
b) data analysis;
c) data preparation;
d) none of the above?
23 Analysis of the subject area and interpretation of the results obtained
through Data Mining are the points of contact between such specialists as:
a) subject specialist;
b) data mining specialist;
c) database administrator;
d) all of the above.
24 Analysis of data requirements and data collection are the points of
contact between such specialists as:
a) subject specialist;
b) data mining specialist;
c) database administrator;
d) all of the above.
25 The following options exist for implementing Data Mining
tools:
a) having a third party develop the Data Mining product to
order;
b) developing the Data Mining product in-house;
c) a combination of these options, including the use of various
libraries, components, and toolkits for developers creating
embedded Data Mining applications.
26 Data mining tools can solve:
a) only one Data Mining task;
b) several Data Mining tasks;
c) all Data Mining tasks;
d) it depends on the specific tool.
27 Which tasks do the data analysis algorithms in
PolyAnalyst solve:
a) modeling;
b) forecasting;
c) clustering;
d) classification;
e) text analysis;
f) all answers are correct?
28 Deductor Studio:
a) can function without a data warehouse;
b) can receive information from any other sources;
c) cannot function without a data warehouse.
29 Describe the capabilities of the Deductor package for filling gaps:
a) there is no possibility to fill in the gaps;
b) gaps can be filled by approximation;
c) gaps can be filled using an algorithm that
substitutes the most likely value for the missing data.
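The two gap-filling ideas named in the options, approximation and substitution of the most likely value, can be sketched as follows. The source does not say how Deductor implements them, so this is an assumption-based stand-in: linear interpolation between neighboring known values for approximation, and the most frequent observed value as the "most likely" one. The function names are hypothetical.

```python
from collections import Counter
from typing import List, Optional

def fill_with_mode(values: List[Optional[str]]) -> List[Optional[str]]:
    """Replace missing entries (None) with the most frequent observed value."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # nothing to learn from, leave gaps as they are
    most_common_value = Counter(observed).most_common(1)[0][0]
    return [most_common_value if v is None else v for v in values]

def fill_by_interpolation(series: List[Optional[float]]) -> List[Optional[float]]:
    """Fill gaps between two known points by linear interpolation.

    Gaps before the first or after the last known value are left as None.
    """
    filled = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled

colors = fill_with_mode(["red", None, "blue", "red", None])
readings = fill_by_interpolation([1.0, None, None, 4.0, None])
```

Mode substitution suits categorical fields, while interpolation only makes sense for ordered numeric series.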
30 Which of the listed algorithms are implemented in the Deductor package:
a) neural networks;
b) autocorrelation;
c) decision tree;
d) self-organizing maps;
e) association rules;
f) all of the above?
1 For which scale are only the operations "equal", "not equal",
"greater than", and "less than" applicable?
2 Into which two groups are data mining methods divided according to
how they work with the original training data?
3 Which stage can be considered an additional stage, or an integral part
of one of the main stages, of data mining?
4 Is the following statement correct: "an association is a special case of
a sequence with a time lag of zero"?
5 What is the basis of the so-called information pyramid?
6 Into which categories are Data Mining tasks divided, depending on
the models used?
7 For which tasks are hierarchical algorithms applied?
8 What do classification and forecasting tasks have in common?
9 Which area involves identifying patterns in the actions of a
website user or a group of users?
10 What factors influence the classification result in the naive
Bayesian approach?
11 What are the benefits of using Bayesian
networks?
12 What tasks are solved using the support vector method?
13 What is a group of neuron synapses?
14 Determine the main function of the artificial neuron.
15 What does the training of self-organizing networks consist of?
16 Identify the applications of traditional visualization techniques.
17 What characterizes one of the main trends in
visualization?
18 What does data warehouse integration mean?
19 What are the main concepts of the data warehouse?
20 If the data set is ordered and has a seasonal
or cyclical component, what is the minimum amount of data needed
for analysis?
21 What should a quality data cleaning program do?
22 What characteristics should a quality data cleaning program
have?
23 At what stage does the work of the subject specialist and the
data mining specialist intersect?
24 Name the characteristics specific to SAS Enterprise Miner.
25 Which tasks do the data analysis algorithms in
PolyAnalyst solve?
26 Describe the capabilities of the Deductor package for filling gaps.
27 What algorithms are implemented in the Deductor package?
28 Is temporary or permanent copying of data required
for analysis in the KXEN system?
29 Which component of KXEN identifies natural groups (clusters)
in the data set?
30 What are the weaknesses of using off-the-shelf
Data Mining software?