Character count steps
1. Start from the Chinese Text Project homepage.
2. Click on the title of the text you wish to count, as listed on that page:
    a. 莊子 - Zhuangzi
    b. 論語 - The Analects
    c. 孟子 - Mengzi
    d. 荀子 - Xunzi
    e. 道德經 - Dao De Jing
    f. 楚辭 - Chu Ci
3. Here things get tricky. Some entries, like 道德經 - Dao De Jing, take you straight to the text, in effect treating it as a single chapter. Most of the others, like 論語 - The Analects, have the text broken down into separate chapters. A few, like 莊子 - Zhuangzi and 楚辭 - Chu Ci, have the chapters organized into subgroups.
4. Once we get to the chapter text, such as 逍遙遊 - Enjoyment in Untroubled Ease, there are some extraneous words and characters on the page (chapter titles repeated at the beginning of each section, English translations, some footnotes, etc.). All we are really interested in counting is the block of Chinese text on the lower right. So, if it is possible to go straight to that area, that might be simplest; but if we can’t avoid the extra material, I don’t think including it will make a difference in the outcome, since it won’t appear in the Full Text Search.
5. Take the first character. For example, in 逍遙遊 - Enjoyment in Untroubled Ease, it is 北.
6. Check to see if it is Chinese or English. If it is English, go on to the next character. (This step may not be necessary. Some of the texts don’t include translations. And even if they do, searching for English words or letters will only result in empty searches.)
7. If it is Chinese, add 1 to the count of Total Characters in the book.
8. Check to see if we have looked at this character in this book before. If so, move on to the next character and return to step 6.
9. Add 1 to the count of Distinct Characters.
10. Return to the home page of the text we are searching (in this case, Zhuangzi).
11. Enter the character (in this case, 北) into the Full Text Search on the right.
12. Check the number Matched on top. (39 in the case of 北.)
13. If the number is 1, add 1 to the count of Unique Characters and 1 to the count of Rare Characters and add this character to the Unique Characters List.
14. If the number is more than one but less than or equal to five, add 1 to the count of Rare Characters and add this character to the Rare Characters List.
15. Return to the Chinese Text Project homepage.
16. Enter the character into the Full Text Search on the right.
17. Check the number Matched on top. (6725 in the case of 北.)
18. If the number is 1, add 1 to the count of Hapax Legomena and 1 to the count of Rare Pre-Han Characters and add this character to the Hapax Legomena List.
19. If the number is more than one but less than or equal to five, add 1 to the count of Rare Pre-Han Characters and add this character to the Rare Pre-Han Characters List.
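20. Once every character in the text has been processed (moving on to the next character and returning to step 6 each time), print: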
    a. [The title of the text]
    b. Total number of characters = [Total Characters]
    c. Number of distinct characters = [Distinct Characters]
    d. Number of Unique Characters = [Unique Characters]
        i. Frequency of Unique Characters = [Unique Characters/Total Characters]
        ii. [Unique Characters List]
    e. Number of Rare Characters = [Rare Characters]
        i. Frequency of Rare Characters = [Rare Characters/Total Characters]
        ii. [Rare Characters List]
    f. Number of Hapax Legomena = [Hapax Legomena]
        i. Frequency of Hapax Legomena = [Hapax Legomena/Total Characters]
        ii. [Hapax Legomena List]
    g. Number of Rare Pre-Han Characters = [Rare Pre-Han Characters]
        i. Frequency of Rare Pre-Han Characters = [Rare Pre-Han Characters/Total Characters]
        ii. [Rare Pre-Han Characters List]
21. Go on to the next text.
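
The bookkeeping in steps 5 to 20 can be sketched in Python. This is only a rough illustration under several assumptions, not a description of how the site works:

- The chapter text is already in hand as a plain string.
- A character counts as Chinese if it falls in a few common CJK Unicode ranges.
- Occurrences are counted locally, not read from the number Matched in the Full Text Search, so the figures may not agree with the site's.
- The names `is_chinese`, `classify`, `count_characters`, and `corpus_count` are made up for this example. `corpus_count` stands in for the homepage search in steps 15 to 17.

```python
from collections import Counter
from typing import Callable, Iterable, Optional


def is_chinese(ch: str) -> bool:
    """Step 6: True if ch falls in one of the CJK ideograph ranges below (an assumption)."""
    cp = ord(ch)
    return (
        0x4E00 <= cp <= 0x9FFF        # CJK Unified Ideographs
        or 0x3400 <= cp <= 0x4DBF     # CJK Unified Ideographs Extension A
        or 0x20000 <= cp <= 0x2A6DF   # CJK Unified Ideographs Extension B
        or 0xF900 <= cp <= 0xFAFF     # CJK Compatibility Ideographs
    )


def classify(chars: Iterable[str], count_of: Callable[[str], int], rare_max: int = 5):
    """Split chars into those counted exactly once and those counted 2..rare_max times."""
    ones, few = [], []
    for ch in chars:
        n = count_of(ch)
        if n == 1:
            ones.append(ch)
        elif 1 < n <= rare_max:
            few.append(ch)
    return ones, few


def count_characters(
    title: str,
    text: str,
    corpus_count: Optional[Callable[[str], int]] = None,  # hypothetical stand-in for steps 15-17
    rare_max: int = 5,
) -> dict:
    # Steps 5-9: keep only Chinese characters and tally each distinct one.
    counts = Counter(ch for ch in text if is_chinese(ch))
    total = sum(counts.values())

    def freq(n: int) -> float:
        return n / total if total else 0.0

    # Steps 13-14: a unique character also adds to the Rare count,
    # but only goes on the Unique list.
    unique, rare = classify(counts, lambda ch: counts[ch], rare_max)
    report = {
        "title": title,
        "total_characters": total,
        "distinct_characters": len(counts),
        "unique_characters": len(unique),
        "unique_frequency": freq(len(unique)),
        "unique_list": unique,
        "rare_characters": len(unique) + len(rare),
        "rare_frequency": freq(len(unique) + len(rare)),
        "rare_list": rare,
    }
    # Steps 15-19: the same thresholds, applied to counts from the whole corpus.
    if corpus_count is not None:
        hapax, rare_pre_han = classify(counts, corpus_count, rare_max)
        report.update({
            "hapax_legomena": len(hapax),
            "hapax_frequency": freq(len(hapax)),
            "hapax_list": hapax,
            "rare_pre_han_characters": len(hapax) + len(rare_pre_han),
            "rare_pre_han_frequency": freq(len(hapax) + len(rare_pre_han)),
            "rare_pre_han_list": rare_pre_han,
        })
    return report


if __name__ == "__main__":
    # Step 20: print the report for a made-up snippet, not a real chapter.
    for key, value in count_characters("demo", "北冥有魚北 abc").items():
        print(f"{key} = {value}")
```

Counting with a `Counter` in one pass replaces the "have we looked at this character before" check in step 8. Leaving `corpus_count` as a plain callable keeps the sketch independent of how the corpus-wide numbers are actually obtained.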