Basics of using database design. Database Design Fundamentals

  • Translation

Databases are used everywhere, including in most projects in the web development world. Everything from the simplest blogs and catalogs to serious social web projects. Regardless of the complexity of the site and its associated database, each requires careful design to operate efficiently as well as reliably.


In this article, we'll cover the basics of developing a good database plan, regardless of its ultimate purpose. All database design options have a set of standard rules and best practices that should be followed. They will help the database stay organized and make it interact with the site in a smarter and more efficient way.

What functionality is required from the database?

The first method used in planning is simple brainstorming, taking notes on paper or otherwise, depending on what needs to be stored in the database and what the site will require. Try not to think about specific fields and tables that will be used in a particular case - all specific points will be discussed by you later. Your goal at this stage is to get a general and complete picture of the database structure, which you will then refine and make more detailed. It can often be more difficult to add elements to your plan later than it was initially.
Photo: binaryape
Step away from the database. Try to think about what will be required from the site? For example, if you want to create a website that brings people together, you might immediately start thinking about the data that users will store. Forget it, put it off for later. It's better to write down that users and information about them should be stored in a database. And what else? What will users do on your site? Will they publish posts, upload files, photos, write messages to each other? Therefore, the database must store all this information: records, files, photos, messages, etc.
How will users interact with your site? Will they need to search, for example, for their favorite recipes, access records available to a specific community, search for products, or see a list of recently viewed and purchased products? The database should have the ability to store recipes, “closed” records accessible to a certain circle of users, information about products, as well as the ability to connect a specific product and user.

Determining the required tables and fields

The next step is to determine which tables and fields will be needed in the database. This is the core of development and the most difficult part of it. Using the right methods for linking tables, defining the data structure in each table, identifying the need to scatter this data across different tables - all these problems arise during the actual design of the database. Now you need to define a list of obviously required tables and fields, be as specific as possible. During this process, some elements may be rebuilt or reorganized in order to improve the efficiency and security of the database.

Use a data modeling tool

Now that you know what the site will need to do, it's time to determine what specific information will need to be stored. A database design tool would be very appropriate here, especially one that has the ability to create visual database models, for example, MySQL Workbench or. Gliffy is an excellent free online tool for creating various flowcharts and database models.

There is also a more well-known, high-quality, in my opinion, tool - Microsoft Visio (Windows only, price $249.99). But don't be alarmed, there are cheaper alternatives, many of which are open-source projects, including the two mentioned above.
Familiarize yourself with the common graphical conventions and standard visual elements needed to create a database model, and begin pre-planning using flowcharts and diagrams. This will avoid logical errors before any specific database is created.

Grouping and separating data

Regarding fields, it is also important to know when to group a certain piece of data and when not to. A good way to determine what information should be in one field or vice versa is to consider whether there will be a need to change any part of it? For example, should you store an address by breaking it down into 1) street, 2) city, 3) state, 4) zip code, 5) country?
Is this an integral part of the site's functionality (perhaps users or administrators will want to search for other users by address or state), or is it simply an increase in the disk space taken up by the database? If this is not so important, then why load the database with changing 5 fields when you can update just one string field. A more convenient option may be to obtain this data from an HTML form, where the fields are separated, and before adding the address to the database, combine the values ​​from the corresponding fields into one line.
This is just one example, but always be aware of the most effective ways to organize table fields, when to combine them, when to keep them separate, to maintain site functionality.

Database Normalization

Normalization is a set of guidelines designed to help you store information more efficiently. We have already mentioned some important basic practices that are included in the most popular normal forms. There are five normal forms. It would be useful to become familiar with these normal forms and design databases according to their requirements.
Database normalization is a big topic, but already understanding the basics can help you tremendously. To have a basic understanding of each normal form and normalization in general, be sure to take a look at

State budgetary professional

educational institution

"College of Automation

and informationion technologies No. 20"

WORKING PROGRAMM

academic discipline _ OP.07 Basics of database design

specialty code/specialty 230401 INFORMATION SYSTEMS (by industry)

level of training: __ base ________

Moscow

2015

APPROVED

at a meeting of the PCC "Library Science", "IS (by Industry", "OTZI"

Protocol No. from " » ______ 2015 G.

Chairman

_____________________________/HER. Shvets/

The academic discipline program was developed in accordance with the requirements of the Federal State Educational Standard for specialty 230401 Information Systems and the curriculum

I APPROVED

Head of the educational structural unit "BTM"

_____________________________/T.I. Stenyaeva/

AGREED

Head educational and methodological department

_____________________________/S.E. Kovalenko/

"_____" ________________________20__

Developer (author): ____ Fedotkina M.V., teacher _______ _________________________________________________

Full name, position, qualification category

Reviewer:

External:______________________________________________

(Full name, place of work, position, qualification category (academic degree, title)

CONTENT

p.

  1. PASSPORT OF THE ACADEMIC DISCIPLINE PROGRAM

  1. STRUCTURE and CONTENT OF THE ACADEMIC DISCIPLINE

  1. conditions for the implementation of the academic discipline

  1. Monitoring and evaluation of the results of mastering the academic discipline

1. passport of the WORKING PROGRAM OF THE EDUCATIONAL DISCIPLINE

"OP.07 Basics of database design"

    1. Scope of application of the work program

The work program of the academic discipline is part of the main professional educational program in accordance with the Federal State Educational Standard for the specialty SVE 230401 “Information systems (by industry) (basic level) of the enlarged group of specialties 230000 Informatics and computer technology.

The work program of the academic discipline can be used in additional vocational education as an elective course for enlarged groups of specialties 230000 Informatics and Computer Science, as well as in advanced training courses for the adult unemployed population.

The academic discipline “Fundamentals of Database Design” is a general professional discipline that forms the basic level of knowledge for mastering special disciplines.

The teaching of the discipline has a practical orientation and is carried out in close connection with other general professional disciplines: “Information Technologies”, “Operating Systems and Environments”, “Computer Architecture and Computing Systems”.

1.2. The place of the academic discipline in the structure of the main professional educational program:

The academic discipline belongs to the cycle of professional disciplines to the block of general professional disciplines.

1.3. Goals and objectives of the academic discipline - requirements for the results of mastering the academic discipline:

Studying the discipline “Fundamentals of Database Design” is aimed at developing general competencies (OK 1-10) and PC 1.1, PC 1.2, PC 1.3, PC 1.7, PC 1.9. according to the Federal State Educational Standard for specialty 230401 Information systems (by industry):
OK 1. Understand the essence and social significance of your future profession, show sustained interest in it.

OK 2. Organize your own activities, choose standard methods and ways of performing professional tasks, evaluate their effectiveness and quality.

OK 3. Make decisions in standard and non-standard situations and take responsibility for them.

OK 4. Search and use information necessary for the effective performance of professional tasks, professional and personal development.

OK 5. Use information and communication technologies in professional activities.

OK 6. Work in a team and team, communicate effectively with colleagues, management, and consumers.

OK 7. Take responsibility for the work of team members (subordinates), the result of completing tasks.

OK 8. Independently determine the tasks of professional and personal development, engage in self-education, consciously plan professional development.

OK 9. To navigate the conditions of frequent changes in technology in professional activities.

OK 10. Perform military duties, including using acquired professional knowledge (for young men).

PC 1.1. Collect data to analyze the use and functioning of the information system, participate in the preparation of reporting documentation, take part in the development of project documentation for modification of the information system.

PC 1.2. Interact with related specialists when developing methods, tools and technologies for using objects of professional activity

PC 1.3. Modify individual modules of the information system in accordance with the work assignment, document the changes made.
PC 1.7. Install and configure the information system within your competence, document the results of the work.

PC 1.9. Follow regulations for updating, technical support and data recovery of the information system, work with technical documentation.

As a result of mastering the discipline, the student must

be able to:

Design a relational database;

Use a query language to programmatically retrieve information from databases;


know:

Basics of database theory;

Data models; features of the relational model and database design, visual tools used in ER modeling;

Fundamentals of relational algebra; database design principles,

Ensuring data consistency and integrity;

Database structure design tools; SQL query language
1.4. Recommended number of hours to master the approximate program of the academic discipline:

maximum student workload 168 hours, including:

student's mandatory classroom teaching load 112 hours,

independent work of the student 56 hours.

2. STRUCTURE AND SAMPLE CONTENT OF THE ACADEMIC DISCIPLINE

2.1. Scope of academic discipline and types of academic work

Type of educational work

Hours volume

168

Mandatory classroom teaching load (total)

112

including:

laboratory works

48

practical lessons

10

test papers

-

course work (project) ( if provided)

-

Independent work of the student (total)

56

including:

independent work on coursework (project) not provided

-

Preparation of a report on the topic:

- “Information technologies of the future”;

- “For which Acces objects you can create reports”;

- “The meaning of new promising directions for the development of DBMS”;

Preparing a presentation on the topic:

- “Methods of defining tables in Access”;

- “The process of creating a request - selection”;

Preparing a message on the topic:

- “Objects of the Access software (SS) and their purpose”;

- “Object – Form”;

- “DBMS direction – Postgres”;

Abstract on the topic:

- “Methods of deleting an attribute in a table”;

- “Describe the process of establishing a relationship between two tables in Acces”;

Implementation of an individual project topic:

- "Timetable of classes";

9

Final certification in the form exam

2.2. Thematic plan and content of the academic discipline op.07 FUNDAMENTALS OF DATABASE DESIGN

Name

sections and topics

Contents of educational material, laboratory and practical work, independent work

students, coursework (project)

Hours volume

Level

development

1

2

3

4

Introduction

Introduction to Database Theory

Laboratory works not provided

Practical lessons not provided

Test papersnot provided

Independent work students not provided

Prepare an abstract on the topic: “Relationship of databases with other disciplines.”

Section 1.

Database. Basic Concepts

12

Topic 1.1. Basic concepts and types of data models

Contents of educational material

6

Give the concepts of object, entity, parameter, attribute, data model. Consider the composition of the information data model.

3

DBMS and its place in the computer software system.DBMS functions. Data presentation levels.

Dialectical transition from one model to another. Three types of logic models: hierarchical, network and relational. The concept of logical and physical data independence.

Laboratory works not provided

-

-

Practical lessons not provided

Test papersnot provided

Independent work students

1. Preparation of a report on the topic “Information technologies of the future”

3

Topic 1.2. DBMS architecture

Contents of educational material

2

Database architectures (two- and three-tier structures, client - server, file - server).

Laboratory worksnot provided

-

Practical lessonsnot provided

Test papersnot provided

Independent work students

2. Message on the topic: “Objects of the Access software (software) and their purpose”

1

Section 2: Database Design

114

Topic 2.1. Design concept

Contents of educational material

2

Types of enterprise data warehouse data models. Ensuring data consistency and integrity. Main stages of database development.

3

Laboratory works not provided

Practical lessons

    Domain analysis.

    Designing a conceptual database model.

    Formalization of the relational model.

6

Test papersnot provided

-

Independent work students

3. Presentation on the topic: “Ways to define tables in Access”

4

Topic 2.2. Data models. Relational data model.

Contents of educational material

6

Types of relationships in the model: “one-to-one”, “one-to-many” and “many-to-many”. Relational approach to building a data model. Features of the relational model and their impact on database design.

3

Visual tools used in ER modeling Converting a many-to-many relationship into a cross-relationship table.

Basic operations of relational algebra

Laboratory works not provided

-

Practical lessonsnot provided

Backbone-modular principle of computer construction. Internal computer architecture; processor, memory. Peripheral devices: keyboard, monitor, disk drive, mouse, printer, scanner, modem, joystick; multimedia components. Software principle of computer control. Operating system: purpose, composition, loading. Types of computer programs. The concept of a file, directory (folder) and the rules for assigning their names. File name template. The path to the file. Entering commands. Installation of programs. Working with directories and files.

-

Test papersnot provided

-

Independent work students

4. Abstract on the topic: “Ways to delete an attribute in a table.”

3

Topic 2.3. Database Design

Contents of educational material

16

Concept, purpose and principle of construction.

Indexing: concept of index, types of index files. Create, activate and delete an index. Reindexing. Sorting, searching and filtering data.

Relationships between tables: establishing and deleting. Types of keys. Methods for joining tables.

Creation of program files. Modularity of programs. Scope of variables.

Menu types. Working with menus: creation, modification, activation and deletion.

Working with windows: opening and closing windows, getting help.

Creating a screen form: properties, events and methods. Controls: properties, events and methods.

Generating and outputting reports

Laboratory works

1. Creating a database in MS Access. Creating tables.

2. Creating tables.

3. Import and export of data

4. Data import and export

5. Creating queries

6. Creating queries

7. Creating queries

8. Creating forms

9. Creating forms

10. Creating forms

11. Generating reports

12. Generating reports

13. Generating reports

14.

15. Creating the main button form

30

Practical lessons

    Database structure design.

    Normalization of tables.

4

Test papersnot provided

-

Independent work students

Implementation of an individual project topic:

    - “Organization of the work of the student library”;

    - “Organization of the work of the university printing house”;

    - “Organization of educational excursions for schoolchildren”;

    - “Organization of control over student progress”;

    - "Timetable of classes";

25

Topic 2.4.

Physical organization of data

Contents of educational material

6

Storage environment mechanisms and DBMS architecture

Structure of stored data

Types of addressing stored records. Organizing connections between stored records

Laboratory works not provided

-

Practical lessonsnot provided

Test papersnot provided

Independent work students

    Report on the topic: “What Acces objects can you create reports for?”

3

Topic 2.5.

Relational Database Management

Contents of educational material

4

Data management is the basis of database administration. Basic concept of data management.

Organization of data management. Database administration.

Laboratory works not provided

-

Practical lessonsnot provided

Test papersnot provided

Independent work students

    Message on the topic: “Object – Form”

2

Section 3.

Database languages

14

Topic 3.1

Language SQL

Contents of educational material

6

SQL Query Language

SQL query language commands for modification: creating a database file, creating a table, adding, editing and deleting records.

Request for data selection: data selection from one table or several tables, with sorting and grouping of data, with a condition for selecting records (filtering).

Laboratory works

16. Creating SQL Queries

17. Creating SQL Queries

18. SQL queries

19. SQL queries

8

Practical lessonsnot provided

-

Test papersnot provided

Independent work students

    Presentation on the topic: “The process of creating a request - sampling”

7

Section 4: Using the Database

30

Topic 4.1.

Ensuring the functioning of databases

Contents of educational material

4

Organization of the database management system

Generalized technology for working with databases

Laboratory worksnot provided

-

Practical lessonsnot provided

Test papersnot provided

Independent work students

    Abstract on the topic: “Describe the process of establishing a relationship between two tables in Acces"

2

Topic 4.2. New database technologies

Contents of educational material

6

Modern information technologies – monitoring of information resources;

Application of case technologies for designing databases and applications;

Modern information technologies – dissemination of data with the widespread use of Web technologies. GIS for data visualization and creation of electronic reference guides.

Laboratory worksnot provided

-

Practical lessonsnot provided

Test papersnot provided

Independent work students

    Report on the topic: “The meaning of new promising directions for the development of DBMS”

3

Topic 4.3.

Modern DBMS

Contents of educational material

4

Multi-platform DBMS. DBMS targeted at specific platforms.

DBMS of the XBase, Dbase family. Prospects for the development of databases and DBMS

Laboratory worksnot provided

-

Practical lessonsnot provided

Test papersnot provided

Independent work students

    Message on the topic: “DBMS direction – Postgres”

2

Total:

168

3. conditions for the implementation of the academic discipline

3.1. Minimum logistics requirements

The implementation of the academic discipline requires the presence of a classroom for computer science, mathematics and computer science.

Classroom equipment:

    List of main equipment:

    network computer class with Internet access;

    seating according to the number of students;

    cabinets for methodological literature;

    information stands.

Technical training aids:

    interactive whiteboard - Interwrite;

    projector -Epson;

    computer workstation for the teacher;

    Printer-HP Deskjet 1280;

    Scanner-Epson perfection v200 PHOTO.

Description of equipment at the workplace:

Processor type Intel® Core™ i5-2400

Processor with a clock frequency of 3.10Ghz;

RAM 4.0 GB ;

HDD 2 Tb;

Acoustic system – Genius;

    operating system - Windows 7x 32;

    antivirus program -Microsoft security Essentials;

    Archiver program-Winrar;

    office software: word processor, spreadsheet processor, program for creating multimedia presentations - Microsoft office 2007;

    database management system - Microsoft office 2007;

    integrated software development environment - Microsoft office 2007;

    visual design system - Microsoft office 2007.

3.2. Information support for training

Main sources:

    Matrosov V.L., Zhdanov S.A., Soboleva M.L. Information systems in structural logical diagrams.-M.:MPGU, 2014.-105p.

    Fufaev E.V., Fufaev D.E. Databases-M.: “Academy”, 2011-320p.

Additional sources:

    Matrosov V.L., Zhdanov S.A., Ivanova N.Yu., Manyakhina V.G., Kostin A.N. Informatics-M.: “Academy”, 2012-336 p.

4. Monitoring and evaluation of the results of mastering the ACADEMIC DISCIPLINE

Control and evaluation the results of mastering the academic discipline are carried out by the teacher in the process of conducting laboratory work, testing, as well as students completing individual assignments and coursework.

Learning outcomes

(mastered skills, acquired knowledge)

Forms and methods of monitoring and assessing learning outcomes

1

2

Skills:

design a relational database;

use SQL language to programmatically retrieve information from databases.

Combined: laboratory workshop, abstracts (reports), reports on laboratory workshop.

Knowledge:

basics of database theory;

Group: abstracts (reports)

data models;

features of the relational model and their impact database design,

Group: abstracts (reports).

visual tools used in ER modeling;

Group: abstracts (reports).

fundamentals of relational algebra;

Group: abstracts (reports).

database design principles,

Group: abstracts (reports).

ensuring data consistency and integrity;

Group: abstracts (reports)

database structure design tools;

Group: abstracts (reports)

SQL query language

Group: abstracts (reports).

7.1. Database Design Fundamentals

The developed functional model of the system answers the questions “What should the system do?” and “Through what actions can the required result be achieved?” This model also allows you to conceptually define the data sets used in the system.

At the same time, it does not answer the question “How is the data organized in the system?” To answer it, it is necessary to build an information model (design a database).

Entity (table, in RDB – relation) – a set (class) of similar real or imaginary objects that are essential for the subject area under consideration, information about which is subject to storage. Examples of entities: employee, part, statement, exam results, etc.

Entity instance (record, string, in RDB – tuple) – a uniquely identifiable object.

Connection – some association between two entities that is significant for the subject area under consideration. Examples of connections can be family relationships “father-son”, production relationships - “superior-subordinate” or arbitrary ones - “to own”, “to have property”.

Attribute (column, field) – property of an entity or connection.

Most modern data modeling typically supports several graphical notations for building information models. In particular, the ERwin system from Computer Associates supports two notations: and (English: Information Engineering - information design). These notations are one-to-one, i.e. the transition from one notation to another and back is performed without loss of model quality. The difference between them lies only in the form of display of model elements.

When using any, a logical database diagram is first constructed in the form of a diagram indicating the entities and connections between them. Logic circuit is called a universal description of the data structure, independent of the final database implementation and hardware platform. Based on the obtained logical circuit, they proceed to the physical data circuit. Physical diagram is a diagram containing all the necessary information to generate a database for a specific DBMS or even a specific version of the DBMS. If in the logical diagram it does not matter what identifiers the tables and attributes have, the data type of the attributes, etc., then the physical diagram must contain a complete description of the database in accordance with the syntax adopted in it, indicating the types of attributes, stored procedures, etc. .d. Using the same logical circuit, you can create several physical ones. For example, ERwin v9.2 allows, based on a logical diagram, to create physical DBMSs for more than 10 industrial DBMSs (ORACLE, MySQL, DB2, MS SQL Server, etc.) and their various versions. Based on the physical schema, you can generate either the database itself or a DDL script 1, which, in turn, can be used to generate the database.

The procedure listed above is called direct database design (Forward Engineering DB) . allow you to also perform reverse engineering DB , i.e. based on the database system catalog or DDL script, build a physical and, further, logical data diagram.

In addition to forward and reverse engineering modes, CASE tools usually support synchronization between the schema and the database system directory, i.e. when changing the schema, they can automatically make all the necessary changes to the existing database and vice versa.

Advanced CASE tools also have built-in subsystem for searching and correcting errors in the circuit . This feature is especially useful when designing large databases containing tens or hundreds of tables, as well as when reverse engineering.

It should be noted that modern DBMSs have their own built-in tools for visual data modeling. Some of them even support classic ERD notations. The disadvantages of such modeling are the construction of only a physical data schema and the impossibility of quickly switching to another DBMS if such a decision is made. The advantage of this approach is a more complete use of the potential of the DBMS, because DBMS developers know its features and capabilities better than others.

The following describes the forward design procedure using the IDEF1X methodology. The IDEF1 methodology was developed by T. Ramey. Currently, based on IDEF1, a new version has been created - the IDEF1X methodology, which was adopted by ICAM as a US federal standard in 1981.

1 Data Definition Language – data definition language, a subset of the SQL language.

The essence of database design, like any other design process, is to create a description of a new system that has not previously existed in this form, which, when implemented, is capable of expectedly functioning under appropriate conditions. It follows from this that the stages of database design must consistently and logically reflect the essence of this process.

Contents of database design and phasing

The design intent is based on some formulated social need. This need has an environment for its occurrence and a target audience of consumers who will use the design result. Consequently, the database design process begins with studying a given need from the point of view of consumers and the functional environment of its intended placement. That is, the first stage is collecting information and defining a model of the system’s subject area, as well as a look at it from the point of view of the target audience. In general, to determine system requirements, the scope of activities as well as the boundaries of database applications are determined.

Next, the designer, who already has certain ideas about what he needs to create, clarifies the tasks supposedly solved by the application, creates a list of them (especially if the project development is a large and complex database), clarifies the sequence of solving problems and performs data analysis. Such a process is also a staged design work, but usually in the design structure these steps are absorbed by the conceptual design stage - the stage of identifying objects, attributes, and connections.

Creating a conceptual (information model) involves the preliminary formation of conceptual user requirements, including requirements for applications that may not be immediately implemented, but taking into account which will improve the functionality of the system in the future. Dealing with representations of set abstraction objects (without specifying physical storage methods) and their relationships, the conceptual model essentially corresponds to the domain model. Therefore, in the literature, the first stage of database design is called infological design.

Next, a separate stage (or an addition to the previous one) follows the stage of forming requirements for the operating environment, where the requirements for computing resources capable of ensuring the functioning of the system are assessed. Accordingly, the larger the volume of the designed database, the higher the user activity and intensity of requests, the higher the requirements for resources: for the computer configuration, for the type and version of the operating system. For example, multi-user operation of a future database requires a network connection using an operating system suitable for multitasking.

The next step is for the designer to select a database management system (DBMS), as well as software tools. After this, the conceptual model must be transferred to a data model compatible with the selected management system. But often this involves making amendments and changes to the conceptual model, since the interconnections between objects reflected in the conceptual model cannot always be implemented using the means of a given DBMS.

This circumstance determines the emergence of the next stage - the emergence of a conceptual model provided with the means of a specific DBMS. This step corresponds to the stage of logical design (creating a logical model).

Finally, the final stage of database design is physical design - the stage of linking the logical structure and the physical storage environment.

Thus, the main stages of design in detailed form are presented in the following stages:

  • information design,
  • formation of requirements for the operating environment
  • selection of control system and database software,
  • logical design,
  • physical design

The key ones will be discussed in more detail below.

Infological design

Identification of entities forms the semantic basis of infological design. An entity here is an object (abstract or concrete), information about which will be accumulated in the system. In the infological model of the subject area, the structure and dynamic properties of the subject area are described in user-friendly terms that do not depend on the specific implementation of the database. But the terms are taken on a standard scale. That is, the description is expressed not through individual objects of the subject area and their relationships, but through:

  • description of object types,
  • integrity constraints associated with the described type,
  • processes leading to the evolution of a subject area - its transition to another state.

An information model can be created using several methods and approaches:

  1. The functional approach is based on the assigned tasks. It is called functional because it is used if the functions and tasks of the persons who will serve their information needs with the help of the designed database are known.
  2. The subject approach focuses on information about the information that will be contained in the database, despite the fact that the query structure may not be defined. In this case, research on a subject area focuses on its most adequate display in the database in the context of the full range of expected information requests.
  3. An integrated approach using the “entity-relationship” method combines the advantages of the previous two. The method comes down to dividing the entire subject area into local parts, which are modeled separately and then recombined into a whole area.

Since the use of the “entity-relationship” method is a combined design method at this stage, it most often becomes a priority.

When methodically divided, local representations should, if possible, include information that would be sufficient to solve a separate problem or to meet the requests of a certain group of potential users. Each of these areas contains about 6-7 entities and corresponds to a separate external application.

The dependence of entities is reflected in their division into strong (base, parent) and weak (child). A strong entity (for example, a reader in a library) can exist in the database on its own, but a weak entity (for example, this reader’s subscription) is “attached” to a strong one and does not exist separately.

It is necessary to separate the concepts of “entity instance” (an object characterized by specific property values) and the concept of “entity type” - an object characterized by a common name and a list of properties.

For each individual entity, attributes (a set of properties) are selected, which, depending on the criterion, can be:

  • identifying (with a unique value for entities of that type, making them potential keys) or descriptive;
  • single-valued or multi-valued (with the appropriate number of values ​​for an entity instance);
  • basic (independent of other attributes) or derived (calculated based on the values ​​of other attributes);
  • simple (indivisible one-component) or composite (combined from several components).

After this, the attribute is specified, the connections are specified in the local view (divided into optional and mandatory) and the local views are merged. If the number of local areas is up to 4-5, they can be combined in one step. If the number increases, the binary merging of areas occurs in several stages.

During this and other intermediate stages, the iterative nature of design is reflected, which is expressed here in the fact that in order to eliminate contradictions it is necessary to return to the stage of modeling local representations for clarification and change (for example, to change the same names of semantically different objects or to coordinate integrity attributes on same attributes in different applications).

Selecting a control system and database software

The practical implementation of the information system depends on the choice of the database management system. The most significant criteria in the selection process are the following parameters:

  • type of data model and its compliance with the needs of the subject area,
  • reserve of possibilities in case of expansion of the information system,
  • performance characteristics of the selected system,
  • operational reliability and convenience of the DBMS,
  • tools aimed at data administration personnel,
  • the cost of the DBMS itself and additional software.

Errors in choosing a DBMS will almost certainly subsequently provoke the need to adjust the conceptual and logical models.

Logical database design

The logical structure of the database must correspond to the logical model of the subject area and take into account the connection of the data model with the supported DBMS. Therefore, the stage begins with choosing a data model, where it is important to take into account its simplicity and clarity.

It is preferable when the natural data structure coincides with the model representing it. So, for example, if the data is presented in the form of a hierarchical structure, then it is better to choose a hierarchical model. However, in practice, such a choice is often determined by the database management system rather than by the data model. Therefore, the conceptual model is actually translated into a data model that is compatible with the selected database management system.

This also reflects the nature of design, which allows for the possibility (or necessity) of returning to the conceptual model to change it if the relationships between objects (or object attributes) reflected there cannot be implemented using the chosen DBMS.

Upon completion of the stage, database schemas of both levels of architecture (conceptual and external) should be generated, created in the data definition language supported by the selected DBMS.

Database schemas are formed using one of two different approaches:

  • or using a bottom-up approach, when work comes from the lower levels of defining attributes, grouped into relationships representing objects, based on the relationships existing between attributes;
  • or using a reverse, top-down approach, used when the number of attributes increases significantly (up to hundreds and thousands).

The second approach involves identifying a number of high-level entities and their relationships with subsequent detailing to the required level, which is reflected, for example, in a model created based on the “entity-relationship” method. But in practice, both approaches are usually combined.

Physical database design

At the next stage of the physical design of the database, the logical structure is displayed in the form of a database storage structure, that is, it is linked to the physical storage environment where the data will be placed as efficiently as possible. Here the data schema is described in detail, indicating all types, fields, sizes and restrictions. In addition to developing indexes and tables, basic queries are defined.

The construction of a physical model involves solving largely contradictory problems:

  1. tasks of minimizing data storage space,
  2. challenges to achieve integrity, security and maximum performance.

The second task conflicts with the first because, for example:

  • for transactions to function effectively, you need to reserve disk space for temporary objects,
  • to increase search speed, you need to create indexes, the number of which is determined by the number of all possible combinations of fields involved in the search,
  • To restore data, database backups will be created and a log of all changes will be kept.

All this increases the size of the database, so the designer is looking for a reasonable balance in which problems are solved optimally by intelligently placing data in memory space, but not at the expense of database security, which includes both protection from unauthorized access and protection from failures.

To complete the creation of a physical model, its operational characteristics are assessed (search speed, efficiency of query execution and resource consumption, correctness of operations). Sometimes this stage, like the stages of database implementation, testing and optimization, as well as maintenance and operation, is taken outside the immediate design of the database.

Currently, human life is so saturated with information that to manage it, it is necessary to create databases and data banks used in various fields of activity. Data processing has evolved from the primitive methods of the 50s to the complex integrated systems of today.

Basic principles of relational database design

Data models are a certain abstraction that, when applied to specific data, allows users and developers to treat them as information, that is, information containing not only data, but also the relationship between them.

The following basic data models exist:

Models based on inverted lists - a database organized using inverted lists is built in such a way that tables and access paths to them are visible to users, while the rows of the tables are physically ordered in some sequence.

Hierarchical data models - a database based on a hierarchical model consists of an ordered set of trees. Each tree has a single "root" and an ordered set of zero or more related subtrees (descendants). The integrity of the connection between them is maintained automatically.

In a database with a network structure, these subtrees can have any number of root ones. In fact, a network database consists of a set of records between these records.

Currently, most databases use relational data models. The relational model is a special method of viewing data that contains data (in the form of tables) and ways of working and manipulating them (in the form of relationships). The relational model assumes three conceptual elements: structure, integrity, and data processing.

A table in a relational database is viewed as the immediate “store” of data. Traditionally, in relational schemas, a table is called a relation. A table row is called a tuple, and a column is called an attribute. In this case, attributes have unique (within the relation) names. The number of tuples is called the cardinal number, and the number of attributes is called the power. A relation is provided with an identifier, that is, one of several attributes whose values ​​are not the same at the same time - the identifier is called the primary key.

A domain is a set of valid homogeneous values ​​for a particular attribute. Thus, a domain can be considered as a named set of data, and a component of this set are logically indivisible units (a domain can be, for example, a list of names of employees of an institution, but not all names can be present in the table).

A relation contains two parts - a header and its own content part. The header contains a finite set of attributes, and the content part (the body of the relationship) contains the set of attribute name and its value.

In a relational database, unlike other models, the user specifies what data is needed, and not how to do it. The formal basis of the relational database model is relational algebra, based on set theory and considering special operators over relations, and relational calculus, based on mathematical logic.

There are many approaches to defining relational algebra, which differ in the set of operations and the way they are interpreted. According to Codd, the set of algebraic operations consists of eight main ones:

  • 1. Sampling attitude;
  • 2. Projection of attitude;
  • 3. Consolidation of relations;
  • 4. Intersection of relationships;
  • 5. Subtraction of ratios;
  • 6. Product of relations;
  • 7. Connecting relationships;
  • 8. Division of relationships;

In addition to the above, there are a number of special operations characteristic of working with a database: as a result of the “renaming” operation, a relation is obtained whose set of tuples coincides with the body of the original relation, but the attribute names have been changed. The “assignment” operation allows you to save the result of calculating a relational expression in an existing database relation. It follows that if the result of a relational operation is a certain relation, then it is possible to form relational expressions in which an embedded relational expression will be used instead of the original relation (operand relation). This occurs due to the fact that the operations of relational algebra are truly closed with respect to the concept of relation.

One of the main requirements for organizing a relational database is to provide the ability to search for some cards based on the possible values ​​of others, for which it is necessary to establish a connection between them.

Connection- this is a functional dependence between two entities (an entity can be connected to itself). If a relationship exists between entities, then instances of the relationship of one entity refer to or are in some way related to instances of another.

At the logical level, connections can be made:

  • 1. one-to-one;
  • 2. one-to-many;
  • 3. many-to-many;
  • 4. many-to-one;

Stages of physical implementation of the designed database

Implementation is the stage of turning a conceptual model into a functioning database. Implementation includes the following steps:

  • 1. Selection and purchase of a DBMS.
  • 2. Convert the conceptual model into a physical model.
  • 3. Building a dictionary.
  • 4. Filling the database.
  • 5. Creation of application programs.
  • 6. User training.