
Friday, August 13, 2010

[Dissertation] A Study on Developing a Faceted Classification Scheme Integrated with a Thesaurus for Literature

Yonsei University, Dept. of Library and Information Science
Dissertation “A Study on Developing a Faceted Classification Scheme Integrated with a Thesaurus for Literature”
Adviser Taesoo Kim [Link]

ABSTRACT
The purpose of this study is to develop a faceted classification scheme with a thesaurus (FCT) that organizes documents by subject matter more effectively than a legacy classification scheme, namely the Korean Decimal Classification (KDC). To achieve this, the new scheme must meet the following requirements:

1) improving the subject representation power of a classification scheme to effectively represent multidimensional subjects, such as compound and complex subjects
2) representing the characteristics of division applied to the classification scheme
3) specifying the conceptual level of the classification scheme to be suitable for various information resources

In facet analysis, the knowledge structure is analyzed into multidimensional aspects called “facets”, which provide a device for representing subjects logically and in detail through facet and phase relations. Furthermore, linking a thesaurus to the classification scheme makes it possible to share facets and to expand the conceptual level of headings through the thesaurus descriptors.

The KDC was selected as the base scheme for developing the FCT, and the National Library Subject Headings (NLSH) was used as the linked thesaurus. The scope of the classification scheme is literature, because this field is well suited to facet analysis and thesaurus linking but is not properly treated in the enumerative KDC. Mixed notation is used instead of pure notation to improve the representational power of the classification number for arranging and displaying information materials, and to expand the search terms during information retrieval.

As a result, the FCT includes a classification rule, facets, a chain index, and a linked thesaurus. The classification rule specifies how the notation system is used to represent the overall structure of the FCT and its subject relations. In addition, six facets (discipline, language, place, period, person, and form) were derived for literature classification, and each facet has its own facet indicator and symbol. The features of the new classification scheme are as follows:

1) It is possible to represent compound subjects more clearly through facet relation and to represent complex subjects that cannot be expressed in the KDC through phase relation.
2) It is possible to change citation order and transform the structure of a classification scheme or browsing method, unlike the rigidly structured hierarchy in the KDC.
3) It is possible to synthesize the class number without determining the main class because disciplines are treated as a facet (not a basic class), and it is simple to add new classification numbers to the FCT.
4) Mnemonic value is improved by using separate facet indicators, facet symbols, and relation indicators, unlike the KDC, which has no indicator except ‘0’.
5) It is possible to treat an ambiguous subject that is difficult to handle as a facet or phase relation by using a subject device.
6) The chain index and thesaurus descriptors linked with facets are provided instead of the relative index in the KDC.
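
The features above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the abstract does not list the actual FCT facet indicators, symbols, or class numbers, so the indicator letters, the codes, and the function names `synthesize` and `chain_index` below are all hypothetical. The sketch shows a class number being synthesized from whichever facets are present (no main class is fixed first), a citation order that can be changed, and chain-index entries derived from the same facets.

```python
# Hypothetical facet indicators: the real FCT symbols are not given in the abstract.
FACET_INDICATORS = {
    "discipline": "D",
    "language": "L",
    "place": "G",
    "period": "T",
    "person": "P",
    "form": "F",
}

# The six facets named in the abstract, in an assumed default citation order.
DEFAULT_ORDER = ["discipline", "language", "place", "period", "person", "form"]


def synthesize(facets, citation_order=DEFAULT_ORDER):
    """Build a mixed-notation class number from (code, label) facet values.

    Only facets present in `facets` are used, in the given citation order.
    """
    parts = []
    for name in citation_order:
        if name in facets:
            code, _label = facets[name]
            parts.append(FACET_INDICATORS[name] + code)
    return " ".join(parts)


def chain_index(facets, citation_order=DEFAULT_ORDER):
    """Build chain-index entries, most specific term first.

    Each entry leads with one facet label, followed by the labels that
    precede it in the citation order, in reverse.
    """
    labels = [facets[name][1] for name in citation_order if name in facets]
    entries = []
    for i in range(len(labels) - 1, -1, -1):
        entries.append(": ".join(reversed(labels[: i + 1])))
    return entries


# Illustrative document: Korean poetry (codes are made up for this sketch).
work = {
    "discipline": ("8", "Literature"),
    "language": ("ko", "Korean"),
    "form": ("1", "Poetry"),
}

print(synthesize(work))                                        # D8 Lko F1
print(synthesize(work, ["form", "language", "discipline"]))    # F1 Lko D8
print(chain_index(work))
```

Passing a different `citation_order` rearranges the same facet values without touching the data, which is the kind of flexibility item 2 describes in contrast to a fixed enumerative hierarchy.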

The FCT is limited in that it is developed only for the field of literature and has a complicated notational system. However, the basic framework, such as facet analysis, thesaurus linking, and classification rules, can be applied to other areas, too. Additionally, although the notational system can initially be considered a bit complicated, its representational power is more important than simplicity because this is a criterion for the intellectual and physical arrangement and relative position of the materials. Furthermore, users recognize the subjects of documents through the terms, not the notation, and use a document subject itself, rather than the number, in the search results.

In the future, the proposed classification scheme needs to be applied to additional areas of study and modified according to the results. A follow-up action for developing the classification scheme management system is also desirable.

Key Words: classification scheme, Colon Classification (CC), facet analysis, facet relationship, Korean Decimal Classification (KDC), National Library Subject Headings (NLSH), phase relationship, subject headings, thesaurus

Wednesday, May 19, 2010

[tool] Thesaurus construction tools

I looked for thesaurus construction tools.

* Programs
1. [Free] TheW32: Tim Craven - Freeware
2. [Commercial] MultiTes: Thesaurus Construction and Publishing Solutions

* Lists of thesaurus construction tools
1. Software for building and editing thesauri - to look into further
2. Thesaurus Management Software by American Society for Indexing - to look into further

Thursday, January 17, 2008

[Article] An Experimental Study on the Construction of Multidimensional Thesaurus

[Article in Journal of Knowledge Processing and Management]
An Experimental Study on the Construction of Multidimensional Thesaurus ▶ full text (PDF)
박지영, 김태수

ABSTRACT
The purpose of this study is to construct a multidimensional thesaurus based on concept definition and facet classification. The subject field of this thesaurus is zymurgy, specifically beer brewing, since brewing terms are so concrete that they can be analyzed more precisely in terms of their characteristics. The concepts were analyzed for conceptual modeling according to the international standard ISO 704:2000(E) and categorized into the basic categories, facets, and isolates of Colon Classification. After this process, a terminological database was constructed and the characteristics were manipulated in order to sort and represent the conceptual relationships. By sorting or categorizing the characteristics in the terminology database with various criteria, we can dynamically show the hierarchical structures and conceptual relationships. This enables us to assign the concepts to various categories according to their characteristics, construct a multidimensional concept system, and reduce the confusion within complex conceptual relationships. Moreover, we can transform the representation of the concept system according to the purposes or needs of the thesaurus user.
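
As a rough illustration of how sorting a terminology database by different characteristics can yield different hierarchies, here is a minimal Python sketch. The records, the characteristic names, and the function `group_by` are assumptions made for this example. They are not taken from the article's actual database.

```python
from collections import defaultdict

# Hypothetical terminology records: each term carries characteristics
# that could be extracted from its concept definition.
TERMS = [
    {"term": "pilsner", "fermentation": "bottom", "color": "pale"},
    {"term": "stout", "fermentation": "top", "color": "dark"},
    {"term": "pale ale", "fermentation": "top", "color": "pale"},
]


def group_by(records, characteristic):
    """Group terms under the values of one characteristic.

    Each value acts as a broader category; the terms under it act as
    narrower terms for that particular view of the concept system.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[characteristic]].append(record["term"])
    return {value: sorted(terms) for value, terms in groups.items()}


# Two views of the same data, chosen by the user's criterion.
print(group_by(TERMS, "fermentation"))
print(group_by(TERMS, "color"))
```

The same three descriptors fall into different broader groups depending on the characteristic chosen, which is the sense in which the concept system is multidimensional.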

Keywords: multidimensional thesaurus, subject indexing, knowledge organization, facet analysis, terminology

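Korean Abstract
The purpose of this study is to construct a thesaurus that extracts the characteristics of terms by using concept definitions and that can express the categories of descriptors by using a faceted classification scheme. To make the scope of the concepts clear, the target terms were limited to the field of zymurgy, and among these, beer terms familiar from everyday life were selected as descriptor candidates. The international standard ISO 704:2000(E) was used as the concept analysis model, basic facets and their subdivisions were applied in the facet classification, and the classified data were built into a terminology database and made searchable on the web. As a result, the thesaurus normalizes and presents concepts that are expressed in a different linguistic form for each term, which makes it possible to establish relationships between concepts. By grouping each descriptor under basic facets and subdivisions, it can reveal the hierarchical relationships of each concept, and suitable groups of broader and narrower terms can be changed and extracted at any time according to the viewpoint of the thesaurus user. It therefore has the advantage of representing, in a multidimensional way, a concept system that is transformed according to the purpose or need for which the thesaurus is used.

Keywords
multidimensional thesaurus, subject indexing, knowledge organization, facet classification, terminology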