Smart Search and Selection System
Limitations of usual search and selection systems
Our approach
Setting up new search and selection system
Proposed commercial products
Verification of new principles
Any application that holds data about many objects needs an embedded subsystem
that lets users search the data and select the objects that satisfy given
selection criteria. Examples of such applications are systems for processing
commodity data (in particular, e-commerce applications), job search sites, and
electronic versions of reference books and encyclopedias.
Objects stored in a database are often very different in nature, so different
kinds of objects have different sets of meaningful properties. For example, any
shop carries a great variety of commodities, and the total list of properties
that are meaningful for at least some items can easily reach 100-1000 entries.
Yet almost every property is meaningful for only a small part of the whole
assortment. For instance, the property "Calories per 100g of product" is
meaningful only for food, while "Color" is meaningful for clothes, footwear,
paints, wine and some other groups of commodities.
When we say that a property is meaningful for an object, we mean not only that
the property can be applied to the object, but also that it matters in the
application. Thus the property "weight" applies to books, but it has no value
for a customer searching for a certain book in a library or a shop.
No wonder that in all the applications mentioned above only a very limited
number of properties are stored in database fields that are available to the
search subsystem. When additional properties (beyond this short list) have to
be recorded, memo or blob fields are used. These applications can therefore
store any additional information about each object, but only in the form of
text or pictures.
The consequence of this traditional approach is that any property mentioned
only in the text of a memo field becomes unavailable to the search subsystem.
More exactly, the user can search only for words or expressions in the texts
of descriptions.
But even very advanced text search methods fail to provide many of the
possibilities available when searching ordinary database fields.
Consider an example. Suppose that we wish to find in an e-shop a brown or gray
wool men's jacket with at least four pockets and a price between $50 and $150.
Usually the search system lets us narrow (consecutively or simultaneously)
the kind of commodity and the price range. It may also let us indicate some
other search criteria (e.g. manufacturer name), but nothing helps us narrow
the selection down to the number of pockets or even the color.
If we search the texts of commodity descriptions for the words "brown" and
"gray", we receive a list that includes jackets with "yellow and brown
stripes" or "gray silk lining". And there is no way at all to select jackets
with at least four pockets.
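The difference between the two kinds of search can be illustrated with a minimal sketch. It assumes a tiny hypothetical catalog in which color, fabric, number of pockets and price are stored as separate fields next to the free-text description; the class, field and function names are invented for illustration, and the "men's" criterion is left out for brevity.

```python
from dataclasses import dataclass

@dataclass
class Jacket:
    name: str
    colors: set          # main colors of the garment itself
    fabric: str
    pockets: int
    price: float
    description: str     # free text, as it would be kept in a memo field

# Hypothetical catalog, invented for this example.
catalog = [
    Jacket("A", {"brown"}, "wool", 4, 120.0, "Brown wool jacket with four pockets."),
    Jacket("B", {"black"}, "wool", 2, 90.0, "Black wool jacket with gray silk lining."),
    Jacket("C", {"white"}, "wool", 5, 60.0, "White wool jacket with yellow and brown stripes."),
    Jacket("D", {"gray"}, "wool", 5, 200.0, "Gray wool jacket with five pockets."),
]

def keyword_search(items, words):
    """Return names of items whose description mentions any of the words."""
    return [j.name for j in items
            if any(w in j.description.lower() for w in words)]

def structured_search(items):
    """Brown or gray wool jacket, at least four pockets, price 50..150."""
    return [j.name for j in items
            if j.colors & {"brown", "gray"}
            and j.fabric == "wool"
            and j.pockets >= 4
            and 50 <= j.price <= 150]

print(keyword_search(catalog, ["brown", "gray"]))  # every jacket mentions a word
print(structured_search(catalog))                  # only jacket A qualifies
```

The keyword search returns all four jackets, including the ones with brown stripes and a gray lining, while the structured search can test the color of the garment and the number of pockets directly.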
Another disadvantage of existing search and selection systems is that they often
allow the user to enter meaningless queries and then search the database in vain.
For example, in one furniture e-shop I tried to find a glass (material) bed (kind
of furniture) for the kitchen (furniture for…) and after approximately ten seconds
received an apology that such items are currently unavailable, together with a
list of other e-shops to search.
No doubt, most search and selection systems in use today are sufficient in many
cases. But suppose that you use an e-shop for new and second-hand cars and wish
to find, in San Francisco or Los Angeles, a second-hand luxury or sports car of
any make except “Volkswagen” and “Fiat”, manufactured in the USA or Japan after
1995, of any color except red and yellow, with automatic transmission and air
conditioning, but without a sunroof or a heated driver's seat. Evidently, none
of the e-commerce systems existing today can provide an adequate selection.
The limitations described above are caused by some fundamental limitations of
the relational databases usually used in these applications. Of course, these
database limitations cannot be removed, but the proposed approach works around
them. This new approach makes it possible to develop a search and selection
system with the following qualities.
-
Any property of any object that is useful for search and selection can be stored
in an ordinary field of a relational database table, so any property can be used
for search. Although most properties are meaningless for any given object, the
database does not reserve any large amount of space for the values of
meaningless properties.
-
The properties included in the system and available for search are of great
variety. They can be numeric (like "Weight"), logical (like "For men or for
women"), chronological ("Date of publication") or selected from a list ("Color"),
and a property can also have a set of elements as its value ("Ingredients of
food product", "Operating systems on which the software can be installed").
-
Both the operator entering data into the search system database and the end
user entering a selection mask can give either an exact or an approximate
value for each meaningful property of any object. Some meaningful properties
of some objects can even remain undefined.
The last feature is necessary when entering information into the database in
some cases (e.g. birth dates of historical persons in a reference book,
technical characteristics of second-hand cars, etc.).
On the other hand, approximate values in a search let the end user express
requirements more freely. For example, the user can indicate not only
"Fabric = wool" but "Fabric = wool, tweed or leather, but not velveteen or
blue denim". This possibility is available even if the value of a property
is a set of items from a list. It is possible to indicate that "The list of
product ingredients should include apple juice and plum juice and contain
no sugar or strawberry juice".
-
Because object properties can have undefined and approximate values,
there are two options for search and selection:
- Narrow selection. An object is selected if it satisfies the conditions
of the search mask for sure.
- Wide selection. An object is selected if it may satisfy the conditions of
the search mask (but this is not known for sure, because the exact values of
some of its properties are unknown).
-
When the operator enters data into the search system database (or the end
user fills in a selection mask), he is prompted to enter, one by one but in
arbitrary order, values of some attributes of the object (or of the search
mask). After the value of any property is entered, the list of properties
offered to the operator or end user grows to include those properties that
are meaningful for all objects satisfying the previously entered values or
restrictions.
Hence the end user never has the possibility to enter a value for a property
that is meaningless for the objects he is seeking.
For example, if we have indicated "Commodity type = clothes" and
"For men or women = men", we will not yet see "Number of pockets" in the list
of properties offered for further selection. This is because many articles of
men's clothing, like neckties or socks, never have pockets at all. But if we
indicate "Kind of man clothes = jacket", this property will appear.
Similarly, while filling the database with values of object properties, the
operator is always prompted to enter values only for properties that are for
sure meaningful for the object being processed.
Note that this last quality is present in some existing search systems (e.g.
selection of regions and cities on weather sites, search for sites by category
on internet search sites, etc.), but only in cases where the structure of the
tree of search questions is fixed. Our approach to developing search systems
allows the structure of the search question tree to change "on the fly".
-
In all cases (except when entering numeric values) the user indicates property
values in the search mask only by selecting from appropriate lists or by
clicking on one or several list items.
When entering data into the search system database, the operator likewise needs
the keyboard only for entering numeric values and for correcting a selection
list or adding a new item to it.
-
The proposed approach can be combined with other techniques in order to provide
all the positive features of the advanced search and selection systems in use
today. In particular, it can be combined with categorization and classified
counting of selected items. Evidently, even though the set of object properties
used for search and selection is considerably enlarged, text descriptions
of objects in memo fields, as well as images of objects, remain important.
-
The features of the new approach fit the principles of 3-layer application
development. All the intelligent work of search and selection
can be isolated in the business rules layer on the server side.
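The storage and selection qualities listed above can be sketched as follows. This is only an illustration under stated assumptions, not the actual storage scheme of the system: each object keeps only the properties recorded for it, a stored value is the set of values the property may have (one element means exactly known, several mean approximately known), and a missing key is treated as "undefined". All names and data are hypothetical.

```python
# Sparse storage (assumption): no space is reserved for properties that are
# not recorded for an object. Each value is a set of possible values.
objects = {
    "jacket-1": {"fabric": {"wool"}, "color": {"brown"}},
    "jacket-2": {"fabric": {"wool", "velveteen"}, "color": {"gray"}},  # fabric approximate
    "jacket-3": {"color": {"brown"}},                                  # fabric undefined
    "jacket-4": {"fabric": {"velveteen"}, "color": {"brown"}},
}

# Search mask: for each property, the set of acceptable values.
mask = {"fabric": {"wool", "tweed", "leather"}, "color": {"brown", "gray"}}

def narrow_match(props, mask):
    """True if the object satisfies every condition of the mask for sure:
    the property is recorded and all its possible values are acceptable."""
    return all(name in props and props[name] <= allowed
               for name, allowed in mask.items())

def wide_match(props, mask):
    """True if the object may satisfy every condition of the mask:
    the property is undefined or at least one possible value is acceptable."""
    return all(name not in props or bool(props[name] & allowed)
               for name, allowed in mask.items())

def select(objects, mask, match):
    """Return the sorted names of objects accepted by the given matcher."""
    return sorted(name for name, props in objects.items() if match(props, mask))

print(select(objects, mask, narrow_match))  # ['jacket-1']
print(select(objects, mask, wide_match))    # ['jacket-1', 'jacket-2', 'jacket-3']
```

In this sketch every object accepted by the narrow selection is also accepted by the wide one. Treating a missing key as "undefined" is a simplification; a real system would also need to distinguish properties that are meaningless for the object.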
Of course, to provide advanced and convenient search for the end user, some additional work has to be done to prepare the data. This work consists of two parts.
First, a description of the structure of all attributes used and of the relations between them has to be made with a
specially developed application. The result of this work is a separate database (we call it "Properties structure").
Although the application's interface is friendly and simple to learn, this work can be done only by a person who is well
familiar with the possible objects and the spectrum of their properties. For example, describing properties for a search
system in a biological encyclopedia requires a biology expert, while a search system for e-commerce requires a
commodity expert.
Second, somebody has to enter and regularly update all meaningful properties (those essential for search) of all objects
that are subject to search. This work is done with another special application that uses the properties structure
database; it can be done by any person and requires no specific qualification.
Entering property values for each object
is very similar to filling in a selection mask by the end user. The application always prompts for values of meaningful properties only,
based on the values previously entered for this object. Also, all values (except numeric ones) can be selected from lists. The
only difference is that in this case the lists from which property values are selected can be changed. If
such changes can affect the structure of properties, the application asks some questions about the meaningfulness of
some properties in some cases and makes the necessary corrections to the structure of properties.
If necessary, this person can change the properties structure and add new properties. But this is much easier than
the initial creation of the properties structure.
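The prompting behaviour described here can be sketched as follows. The sketch assumes a hypothetical properties structure in which each property lists the conditions under which it is meaningful; the dictionary layout, property names and value lists are invented for illustration and are not taken from the actual "Properties structure" database.

```python
# Hypothetical properties structure: property -> conditions under which it is
# meaningful. An empty condition means "meaningful for all objects".
STRUCTURE = {
    "Commodity type": {},
    "For men or women": {"Commodity type": {"clothes", "footwear"}},
    "Kind of clothes": {"Commodity type": {"clothes"}},
    "Number of pockets": {"Kind of clothes": {"jacket", "trousers"}},
    "Calories per 100g": {"Commodity type": {"food"}},
}

def offered_properties(entered):
    """Return the properties not yet entered that are meaningful for every
    object consistent with the values entered so far.

    `entered` maps a property name to the set of values chosen for it."""
    offered = []
    for name, condition in STRUCTURE.items():
        if name in entered:
            continue
        # The property is offered only if every condition is guaranteed
        # by the values already entered.
        if all(prop in entered and entered[prop] <= allowed
               for prop, allowed in condition.items()):
            offered.append(name)
    return offered

step1 = offered_properties({})
step2 = offered_properties({"Commodity type": {"clothes"}})
step3 = offered_properties({"Commodity type": {"clothes"},
                            "For men or women": {"men"}})
step4 = offered_properties({"Commodity type": {"clothes"},
                            "For men or women": {"men"},
                            "Kind of clothes": {"jacket"}})
print(step1, step2, step3, step4, sep="\n")
```

With this structure, "Number of pockets" is offered only after the kind of clothes has been narrowed to an item that always has pockets, which matches the jacket example given earlier. The same function could serve both the data entry application and the search mask.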
Our search and selection system is based on three applications:
-
Application for creating database of properties structure for some wide class of objects.
-
Application for entering values of properties of objects.
-
Application for performing the search itself.
The commercial product can be delivered in two versions:
-
All three applications
-
The last two applications and one or several databases with a properties structure for some domain of objects.
For example, a big clothes store can buy the last two applications and a database with the properties structure for clothes for its reference service. While entering data for its own commodities, the store will inevitably correct the properties structure database, but it avoids creating this database from scratch.
Only the last two applications use the table with the list of all objects, and the connection between their internal database and the database with the list of objects is very simple. This makes it possible to develop the search system applications as independent products that can easily be embedded in almost any application containing a relational database with a table of objects of any kind.
To test the new approach to search and selection, a special application
that embeds the functionality of all three systems was developed.
The purpose of this application was to verify the new principles that form
the basis of the approach and to gain experience in order to improve
functionality, select appropriate additional service functions and find the
most convenient user interface. That is why it lacks some service functions
and even a help system.
Tests done with this application showed that the new principles of search and
selection are correct and do provide considerably more advanced
possibilities for end users.
The screenshots below give some notion of how the list of available
properties grows while the end user fills in the selection mask, and how the
user enters approximate values (even when the value of a property is a set
of elements from a list).
Keep in mind that our purpose was to verify the new principles of selection
on a set of very different kinds of objects with a wide spectrum of properties,
most of which are meaningful only for some objects. Therefore we included in
the list of objects very different kinds of items, which would hardly be put
together in a single real database.
The first property offered to the end user should be meaningful for all objects
While the user enters selection masks for some properties, the list of displayed properties expands
The value of the property “Ingredients of canned food product” is a set of items from the list of ingredients.
Nevertheless, the user can enter an approximate value of this property, and
even in several ways. On the screenshot he used mouse clicks to mark with "+"
or "–" the items that should (or should not) be ingredients of the product sought.
The user selected “Country of origin”, indicating that any country except Brazil and China is acceptable.
An approximate value for a numeric property can include an upper and (or) lower bound
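The three kinds of approximate values shown in these captions can be expressed as simple predicates. This is a hedged sketch with hypothetical function names and sample values, not the code of the test application.

```python
def set_mask_matches(ingredients, required, forbidden):
    """Set-valued property: every item marked "+" must be present and
    no item marked "-" may be present."""
    return required <= ingredients and not (forbidden & ingredients)

def exclusion_matches(value, excluded):
    """List-valued property: any value except the excluded ones is accepted."""
    return value not in excluded

def range_matches(value, lower=None, upper=None):
    """Numeric property: an optional lower and/or upper bound."""
    return ((lower is None or value >= lower)
            and (upper is None or value <= upper))

juice = {"apple juice", "plum juice", "water"}
print(set_mask_matches(juice, {"apple juice", "plum juice"},
                       {"sugar", "strawberry juice"}))   # True
print(exclusion_matches("Italy", {"Brazil", "China"}))   # True
print(range_matches(120, lower=50, upper=150))           # True
```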
After performing a selection (wide or narrow) the user receives a list of
selected objects and can see all their meaningful properties. Of course, the
list of these properties differs between selected objects. For some
objects it can contain properties that were not even displayed when the
user filled in the selection mask.
In some cases the value of a property of a selected object cannot be displayed
in a short line, so the user can always view this value
described in plain English in a separate window.