1.1 Data representation

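In computer science, data representation is one of the foundational concepts: it determines how a computer stores, processes, and transmits information. In this section you need to master:

How binary values are represented
Conversion between different number bases
Binary arithmetic and how character data is stored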

Number System

Computers use different number bases to represent data. The common ones are binary (base 2), denary (base 10), and hexadecimal (base 16), together with the binary coded decimal (BCD) encoding.

Number Base Conversion

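(1) Binary to Denary (Binary to Decimal)

To convert a binary number to denary, work bit by bit: multiply each bit (0 or 1) by its place value, which is $2^i$ for the bit $i$ positions from the right (starting at $i = 0$), then add the results, that is, $\sum_i b_i \times 2^i$.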
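
The place-value method can be sketched in a few lines of Python. This is only an illustration: the function name `binary_to_denary` and the sample input `1101` are assumptions chosen for the sketch, not part of the original notes.

```python
def binary_to_denary(bits: str) -> int:
    """Sum bit * 2**position, with position 0 at the right-hand end."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** position
    return total

print(binary_to_denary("1101"))  # 1*8 + 1*4 + 0*2 + 1*1 = 13
```

Python's built-in `int("1101", 2)` gives the same result; the loop is written out only to mirror the manual method.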

Example 1:
Convert to denary.

Solution:

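(2) Denary to Binary (Decimal to Binary)

To convert a denary number to binary, the usual method is repeated division by 2:

Divide the denary number by 2 and record the remainder.
Divide the quotient by 2 again, repeating step 1 until the quotient becomes 0.
Read the remainders in reverse order (last remainder first); this is the binary number.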

Example 2:
Convert $13_{10}$ to binary.

Solution:

Division	Quotient	Remainder
13 ÷ 2	6	1
6 ÷ 2	3	0
3 ÷ 2	1	1
1 ÷ 2	0	1

So $13_{10} = 1101_2$.
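
The repeated-division steps can be written as a short Python sketch. The function name `denary_to_binary` is an assumption made for illustration; the loop follows the same quotient and remainder columns as the table above.

```python
def denary_to_binary(n: int) -> str:
    """Repeated division by 2; remainders are read in reverse order."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, remainder = divmod(n, 2)   # quotient carries on, remainder is recorded
        remainders.append(str(remainder))
    return "".join(reversed(remainders))

print(denary_to_binary(13))  # 1101
```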

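(3) Binary to Hexadecimal

Hexadecimal is the base-16 number system; it uses the digits 0-9 and A-F (A=10, B=11, …, F=15). To convert binary to hexadecimal:

Split the binary number into groups of 4 bits, working from right to left (pad the leftmost group with leading 0s if it has fewer than 4 bits).
Convert each group to one hexadecimal digit.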

Example 3:
Convert $10111011_2$ to hexadecimal.

Solution:

Binary	1011	1011
Hexadecimal	B	B

So $10111011_2 = \text{BB}_{16}$.
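
A minimal Python sketch of the grouping method, assuming the input is a string of 0s and 1s. The names `HEX_DIGITS` and `binary_to_hex` are hypothetical helpers introduced only for this illustration.

```python
HEX_DIGITS = "0123456789ABCDEF"

def binary_to_hex(bits: str) -> str:
    """Pad to a multiple of 4 bits, then map each nibble to one hex digit."""
    padded_length = -(-len(bits) // 4) * 4          # round up to a multiple of 4
    bits = bits.zfill(padded_length)
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)

print(binary_to_hex("10111011"))  # BB
```

A 6-bit input such as `101111` is padded on the left to `0010 1111` before conversion.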

(4) Hexadecimal to Binary

Converting hexadecimal to binary is the reverse of (3): convert each hexadecimal digit to a 4-bit binary group, then join the groups together.

Example 4:
Convert $\text{2F}_{16}$ into binary.

Solution:

Hexadecimal	2	F
Binary	0010	1111

So $\text{2F}_{16} = 00101111_2$.
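
The digit-by-digit expansion can be sketched as follows. The function name `hex_to_binary` is an assumption for illustration; it relies only on Python's built-in `int` and `format`.

```python
def hex_to_binary(hex_digits: str) -> str:
    """Expand every hexadecimal digit into its 4-bit binary group."""
    return "".join(format(int(digit, 16), "04b") for digit in hex_digits)

print(hex_to_binary("2F"))  # 00101111
```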

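(5) Converting between Denary and Hexadecimal

Conversion between denary and hexadecimal works like the earlier methods. Denary to hexadecimal: divide by 16 repeatedly, record the remainders, and read them in reverse order.
Hexadecimal to denary: multiply each digit by its power of 16 and add the results, that is, $\sum_i d_i \times 16^i$, where $d_i$ is the digit $i$ positions from the right.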

Example 5:
(a) Convert $47_{10}$ to hexadecimal.
(b) Convert to denary.

Solution:

(a)

Division	Quotient	Remainder (hexadecimal)
47 ÷ 16	2	15 (F)
2 ÷ 16	0	2

So $47_{10} = \text{2F}_{16}$.

(b)
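
Both directions of the conversion described in (5) can be sketched in Python. The helper names `denary_to_hex` and `hex_to_denary` are assumptions for this sketch, and the sample values reuse the 47 from part (a).

```python
HEX_DIGITS = "0123456789ABCDEF"

def denary_to_hex(n: int) -> str:
    """Repeated division by 16; remainders are read in reverse order."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, 16)
        digits.append(HEX_DIGITS[remainder])
    return "".join(reversed(digits))

def hex_to_denary(hex_digits: str) -> int:
    """Sum digit * 16**position, with position 0 at the right-hand end."""
    total = 0
    for position, digit in enumerate(reversed(hex_digits.upper())):
        total += HEX_DIGITS.index(digit) * 16 ** position
    return total

print(denary_to_hex(47))    # 2F
print(hex_to_denary("2F"))  # 47
```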

(6) Binary Coded Decimal (BCD)

BCD is an encoding that represents each denary digit with 4 binary bits: every denary digit is converted separately into its own 4-bit group.

Example 6:
Convert $25_{10}$ to BCD.

Solution:

Denary	2	5
BCD	0010	0101

So $25_{10} = 0010\ 0101_{\text{BCD}}$.
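
A minimal sketch of BCD encoding in Python, assuming a non-negative integer input. The function name `denary_to_bcd` is hypothetical and chosen only for illustration.

```python
def denary_to_bcd(n: int) -> str:
    """Encode each denary digit of a non-negative integer as its own 4-bit group."""
    return " ".join(format(int(digit), "04b") for digit in str(n))

print(denary_to_bcd(25))  # 0010 0101
```

Note that this differs from plain binary: 25 in pure binary is `11001`, while its BCD form needs two full nibbles.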

Exercise 1

Convert to binary.
Convert to BCD.
Convert to hexadecimal.
Convert to decimal.
Convert to decimal.

Binary Magnitudes and Prefixes

All computers are built on binary. Each binary digit is called a bit, a contraction of "binary digit".
Bit: The basic unit of information in computing and digital communications. A bit can have a value of either 0 or 1, representing off or on states in digital electronics.

A group of 8 bits is called a byte, the basic unit for measuring file size.
Byte: Consists of eight bits. Bytes are the fundamental units for measuring data size. A byte can represent 256 different values ($2^8$), which is enough to cover a wide range of characters in text.

A group of 4 bits is called a nibble, which matters a great deal in hexadecimal notation. Recall the conversion between hexadecimal and binary: 4 binary digits correspond to 1 hexadecimal digit, so one nibble corresponds to exactly one hexadecimal digit.

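Computer storage is based on binary, so capacities such as memory size and the disk sizes reported by an operating system are usually counted with binary prefixes (KiB, MiB, GiB). Data transfer rates and the disk manufacturing industry, however, usually use decimal prefixes (KB, MB, GB). As a result, the usable space a computer reports for a storage device looks smaller than the advertised value.

For example:

1 TB (as advertised by the manufacturer) = 1,000,000,000,000 bytes
1 TiB (as counted by the computer) = 1,099,511,627,776 bytes
The operating system therefore shows a 1 TB drive as about 931 GiB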

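Binary prefix	Value	Decimal prefix	Value
Kibibyte (KiB)	$2^{10}$ bytes	Kilobyte (KB)	$10^{3}$ bytes
Mebibyte (MiB)	$2^{20}$ bytes	Megabyte (MB)	$10^{6}$ bytes
Gibibyte (GiB)	$2^{30}$ bytes	Gigabyte (GB)	$10^{9}$ bytes
Tebibyte (TiB)	$2^{40}$ bytes	Terabyte (TB)	$10^{12}$ bytes
Pebibyte (PiB)	$2^{50}$ bytes	Petabyte (PB)	$10^{15}$ bytes
Exbibyte (EiB)	$2^{60}$ bytes	Exabyte (EB)	$10^{18}$ bytes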

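Example 7:
(a) Convert 2 GiB to MiB.
(b) What is the actual available space, in GiB, on a computer for a storage device with a nominal space of 500 GB?

Solution:

(a) $2\ \text{GiB} = 2 \times 1024\ \text{MiB} = 2048\ \text{MiB}$.

(b) $500\ \text{GB} = 500 \times 10^{9}$ bytes, and $\dfrac{500 \times 10^{9}}{2^{30}} \approx 465.66\ \text{GiB}$.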
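
The conversion between decimal and binary prefixes is a single division, sketched below. The function name `gb_to_gib` is an assumption for illustration; it uses 1 GB = $10^9$ bytes and 1 GiB = $2^{30}$ bytes.

```python
def gb_to_gib(gigabytes: float) -> float:
    """Convert a decimal-prefix size (1 GB = 10**9 bytes) to GiB (2**30 bytes)."""
    size_in_bytes = gigabytes * 10 ** 9
    return size_in_bytes / 2 ** 30

print(round(gb_to_gib(500), 2))   # 465.66
print(round(gb_to_gib(1000), 2))  # 931.32 (a nominal 1 TB drive)
```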

Exercise 2

Convert 4 TiB to GiB.
Explain why storage manufacturers use decimal prefixes while operating systems use binary prefixes.
A hard drive is advertised as 2 TB. How much space does it actually have in TiB?

Binary Arithmetic

Binary Addition

Binary addition follows these rules:

A	B	A + B
0	0	0
0	1	1
1	0	1
1	1	0 (carry 1 to the next higher bit)
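
The addition rules can be applied column by column, as in the following sketch. The function name `add_binary` and the sample operands are assumptions made for illustration and are not taken from the examples in these notes.

```python
def add_binary(a: str, b: str) -> str:
    """Add two unsigned binary strings column by column, right to left."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    result = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        column_sum = int(bit_a) + int(bit_b) + carry
        result.append(str(column_sum % 2))   # bit written in this column
        carry = column_sum // 2              # carry passed to the next column
    if carry:
        result.append("1")
    return "".join(reversed(result))

print(add_binary("1011", "0110"))  # 10001 (11 + 6 = 17)
```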

Example 8:
Calculate

Solution:

Binary Subtraction

Binary subtraction follows these rules:

A	B	A - B	Borrow
0	0	0	0
0	1	1	1
1	0	1	0
1	1	0	0
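
The borrow rules can be sketched in the same column-by-column style. The function name `subtract_binary` and the operands are assumptions for illustration; the sketch only handles the case where the first number is at least as large as the second.

```python
def subtract_binary(a: str, b: str) -> str:
    """Compute a - b for unsigned binary strings, assuming a >= b."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    borrow = 0
    result = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        difference = int(bit_a) - int(bit_b) - borrow
        borrow = 1 if difference < 0 else 0   # borrow from the next column
        result.append(str(difference % 2))
    return "".join(reversed(result))

print(subtract_binary("1101", "0110"))  # 0111 (13 - 6 = 7)
```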

Example 9:
Calculate

Solution:

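When the minuend is smaller than the subtrahend, the calculation can be carried out with One's Complement or Two's Complement (see below).

In computer science, complement representation is used to represent signed integers. The integers stored in a computer are divided into:

Unsigned integers: can only represent non-negative numbers (for example, $n$ bits store the range $0$ to $2^n - 1$).
Signed integers: can also represent negative numbers; the common methods are One's Complement and Two's Complement.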

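Overflow

Overflow occurs in signed binary arithmetic (Two's Complement) when the true result does not fit in the available bits. It typically shows up as:

positive + positive = negative
negative + negative = positive

Example 10:

Calculate (assume only 4 bits are available in total)

In the example above, the result exceeds the range that 4-bit Two's Complement can represent, so overflow occurs.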

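Detecting overflow

Unsigned addition: overflow occurs if there is a carry out of the most significant bit.
Signed addition:
- Adding two positive numbers gives a negative result: overflow.
- Adding two negative numbers gives a positive result: overflow.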
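
The signed overflow rule can be checked mechanically. The sketch below assumes both operands already fit in the chosen width; the function name `add_signed` and the sample values are hypothetical and are not the numbers from Example 10.

```python
def add_signed(a: int, b: int, bits: int = 4) -> tuple[int, bool]:
    """Add two values in `bits`-bit Two's Complement; return (wrapped result, overflow flag)."""
    mask = (1 << bits) - 1
    raw = (a + b) & mask                      # keep only the lowest `bits` bits
    sign_bit = 1 << (bits - 1)
    result = raw - (1 << bits) if raw & sign_bit else raw
    # Overflow: both operands share a sign but the result has the other sign.
    overflow = (a >= 0) == (b >= 0) and (result >= 0) != (a >= 0)
    return result, overflow

print(add_signed(5, 4))    # (-7, True): 0101 + 0100 = 1001
print(add_signed(-3, -6))  # (7, True): 1101 + 1010 = 1 0111, carry discarded
print(add_signed(3, -6))   # (-3, False)
```

Adding a positive and a negative number can never overflow, which is why only the same-sign cases are tested.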

One's Complement

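Definition of One's Complement:

Positive numbers are the same as their unsigned representation.
A negative number is obtained by inverting every bit of the binary representation of the corresponding positive number (0 becomes 1, 1 becomes 0).

$-x = \overline{x}$ (bitwise inversion: 0 becomes 1, 1 becomes 0)

Example:

Denary	Binary (8-bit)	One's Complement
+5	`00000101`	`00000101`
-5	`00000101` → `11111010`	`11111010`

When adding in One's Complement, if the addition produces a carry out of the most significant bit, that carry must be added back to the least significant bit (an end-around carry).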
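
A small sketch of One's Complement addition with the end-around carry. The function name `ones_complement_add` and the operands (7 and -5 in 8 bits) are assumptions chosen for illustration.

```python
def ones_complement_add(a: str, b: str) -> str:
    """Add two equal-width One's Complement bit strings with end-around carry."""
    width = len(a)
    total = int(a, 2) + int(b, 2)
    if total >> width:                             # a carry left the top bit
        total = (total & ((1 << width) - 1)) + 1   # add it back at the bottom
    return format(total, f"0{width}b")

# 7 + (-5) in 8 bits: 00000111 + 11111010 = 1 00000001 -> 00000010 (+2)
print(ones_complement_add("00000111", "11111010"))  # 00000010
```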

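In an $n$-bit One's Complement system, the range of values is:

minimum $= -(2^{n-1} - 1)$, maximum $= 2^{n-1} - 1$

Bits	Minimum	Maximum
4 bits	$-7$	$+7$
8 bits	$-127$	$+127$
16 bits	$-32767$	$+32767$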

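Example 11:
Calculate , in One's Complement.

Solution:

A carry out of the most significant bit is produced, so it is added back to the least significant bit (end-around carry).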

Two's Complement

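Two's Complement solves the double-zero problem of One's Complement (which has both $+0$ and $-0$) and makes addition and subtraction simpler. It is defined as follows:

Positive numbers are the same as their unsigned representation.
Negative numbers are represented by:
- First take the One's Complement (invert every bit).
- Then add 1 to obtain the final Two's Complement.

Denary	Binary (8-bit)	One's Complement	Two's Complement
+5	`00000101`	`00000101`	`00000101`
-5	`00000101` → `11111010`	`11111010`	`11111011`

Two's Complement makes addition and subtraction simple because:

Addition and subtraction can be carried out directly; any carry out of the most significant bit is simply discarded, with no end-around carry.
Negative numbers are already stored in complement form, so if Example 11 is done in Two's Complement, the two bit patterns can simply be added.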
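
The "invert, then add 1" rule can be sketched as follows. The function names `to_twos_complement` and `from_twos_complement` are assumptions introduced for this illustration; the sample values reuse the ±5 from the table.

```python
def to_twos_complement(value: int, bits: int = 8) -> str:
    """Encode a signed integer as a `bits`-wide Two's Complement bit string."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError(f"{value} does not fit in {bits} bits")
    if value >= 0:
        return format(value, f"0{bits}b")
    inverted = ~(-value) & ((1 << bits) - 1)   # One's Complement of |value|
    return format(inverted + 1, f"0{bits}b")   # then add 1

def from_twos_complement(pattern: str) -> int:
    """Decode a Two's Complement bit string back to a signed integer."""
    raw = int(pattern, 2)
    return raw - (1 << len(pattern)) if pattern[0] == "1" else raw

print(to_twos_complement(5))             # 00000101
print(to_twos_complement(-5))            # 11111011
print(from_twos_complement("11111011"))  # -5
```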

Example 12:
Calculate , in Two's Complement.

Solution:

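In an $n$-bit Two's Complement system, the range of values is:

minimum $= -2^{n-1}$, maximum $= 2^{n-1} - 1$

Bits	Minimum	Maximum
4 bits	$-8$	$+7$
8 bits	$-128$	$+127$
16 bits	$-32768$	$+32767$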
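
The Two's Complement range for any width follows directly from $-2^{n-1}$ and $2^{n-1} - 1$. A minimal sketch, with the hypothetical helper name `twos_complement_range`:

```python
def twos_complement_range(bits: int) -> tuple[int, int]:
    """Return (minimum, maximum) for a `bits`-wide Two's Complement integer."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

for bits in (4, 8, 16):
    print(bits, twos_complement_range(bits))
# 4 (-8, 7)
# 8 (-128, 127)
# 16 (-32768, 32767)
```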

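BCD Addition and Subtraction

Addition

BCD needs a special addition rule:

Add the two BCD digits directly as binary numbers;
If the result is ≥ 10 (that is, 1010₂ or above), add 6 (0110₂) to correct it.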

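Example 13: Calculate $1001_{\text{BCD}} + 0100_{\text{BCD}}$ (i.e. 9 + 4)

Solution:

$1001_2 + 0100_2 = 1101_2$, which is 13 and therefore not a valid BCD digit. Adding the correction gives $1101_2 + 0110_2 = 1\ 0011_2$, so the result is $0001\ 0011_{\text{BCD}}$, i.e. 13.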
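
The add-6 correction for a single BCD digit can be sketched as below, using the 9 + 4 case from this example. The function name `bcd_add_digit` and its `carry_in` parameter are assumptions made for illustration.

```python
def bcd_add_digit(a: int, b: int, carry_in: int = 0) -> tuple[int, int]:
    """Add two BCD digits (0-9); return (result digit, carry out)."""
    total = a + b + carry_in           # plain binary addition
    if total >= 10:                    # not a valid BCD digit
        total += 6                     # correction step: add 0110
    return total & 0b1111, total >> 4  # low nibble is the digit, bit 4 is the carry

digit, carry = bcd_add_digit(0b1001, 0b0100)
print(format(carry, "04b"), format(digit, "04b"))  # 0001 0011, i.e. 13
```

For multi-digit BCD numbers, the same function would be applied from the rightmost digit leftwards, passing each carry into the next digit.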

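Subtraction

BCD subtraction usually uses the Ten's Complement method:

Find the Ten's Complement of the subtrahend:
- Compute the 9's Complement (subtract each digit from 9).
- Then add 1.
Add the result to the minuend. When the minuend is at least as large as the subtrahend, a carry out of the most significant digit appears and is discarded.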
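
The Ten's Complement steps can be sketched on denary digits, each of which would be stored as one BCD nibble. The function names, the fixed digit count, and the sample values 52 and 27 are all assumptions for illustration, and the sketch only covers a minuend that is at least as large as the subtrahend.

```python
def tens_complement(value: int, digits: int) -> int:
    """9's complement of each digit, then add 1 (equals 10**digits - value)."""
    nines = int("".join(str(9 - int(d)) for d in str(value).zfill(digits)))
    return nines + 1

def subtract_with_tens_complement(minuend: int, subtrahend: int, digits: int = 2) -> int:
    """Compute minuend - subtrahend, assuming minuend >= subtrahend >= 0."""
    total = minuend + tens_complement(subtrahend, digits)
    return total % 10 ** digits       # discard the carry out of the top digit

print(tens_complement(27, 2))                 # 73
print(subtract_with_tens_complement(52, 27))  # 25 (52 + 73 = 125, drop the leading 1)
```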

Practical applications of binary arithmetic and overflow (concepts):
In real-world computing, binary arithmetic and overflow management are essential in both hardware and software design.

Design and System Architecture: In system design, accommodating the possibility of overflow is vital to ensure robust and reliable operations.

Real-World Scenarios: From simple calculators to complex operating systems, binary operations and overflow considerations are omnipresent. Effective handling of these elements is critical in the development of resilient and efficient software and hardware systems.

Example 14:
Explain the concept of overflow in the context of binary addition. Provide an example where overflow occurs in an 8-bit binary addition operation.

Solution:
Overflow in binary addition refers to a situation where the sum of two binary numbers exceeds the maximum number that can be represented with a given number of bits. In an 8-bit system, the highest representable number is 11111111 (255 in decimal). For instance, consider adding the binary numbers 11110000 (240 in decimal) and 10010000 (144 in decimal). The sum is 110000000, which is a 9-bit number. Since an 8-bit system can only accommodate 8 bits, the leftmost bit (1 in this case) cannot be represented, leading to overflow. This overflow signifies that the actual sum is beyond the range of what the system can represent, thus producing an incorrect result or potentially causing errors in computations.

Exercise 3

Convert to Two’s Complement (8-bit).
Find the Two’s Complement of (8-bit).
Compute using Two’s Complement.
Determine the range of numbers that can be represented in a 10-bit Two’s Complement system.
Perform BCD addition for .
Perform BCD subtraction for .
Describe how two's complement is used for representing negative numbers in binary and demonstrate with an example how it aids in binary subtraction.

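Character Encoding

Computers can only store and process binary data (0s and 1s), but people communicate with letters, digits, and symbols. A method is therefore needed to convert characters into binary: this is character encoding.

ASCII

ASCII was the earliest widely used character encoding. It represents each character with a 7-bit binary number (8 bits in the extended versions).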

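Standard ASCII

Uses 7 binary bits to represent 128 characters (0 to 127).
It mainly includes:
- Control characters (0-31): e.g. NUL (0), BEL (7, bell)
- Digits (48-57): e.g. 0 (48 or 011 0000₂)
- Uppercase letters (65-90): e.g. A (65 or 100 0001₂)
- Lowercase letters (97-122): e.g. a (97 or 110 0001₂)

A Level does not require you to memorise every ASCII code, but try to learn the ones in the table below.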

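Character	ASCII code (binary)	ASCII code (denary)
0	0110000	48
A	1000001	65
a	1100001	97

The ASCII code of 0 is 48, so the digits 0 - 9 have the ASCII codes 48 - 57;
likewise, the uppercase letters A - Z have the ASCII codes 65 - 90, and the lowercase letters a - z have 97 - 122.

When answering questions, check whether the ASCII code in the question is given in denary or in hexadecimal: the letter A is 65 in denary but 41 in hexadecimal!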

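In essence, the 6th bit from the right (place value 32) is 0 for an uppercase letter, and the matching lowercase letter is obtained by changing that bit from 0 to 1, for example:

A: $100\,0001_2$

a: $110\,0001_2$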
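
The ASCII codes and the case bit can be inspected with Python's built-in `ord` and `chr`. The constant `CASE_BIT` and the helper names `to_lower` and `to_upper` are assumptions for this sketch, and the helpers are only meaningful for the letters A-Z and a-z.

```python
CASE_BIT = 0b0100000   # 6th bit from the right, place value 32

def to_lower(letter: str) -> str:
    """Set the case bit of an uppercase ASCII letter (A-Z)."""
    return chr(ord(letter) | CASE_BIT)

def to_upper(letter: str) -> str:
    """Clear the case bit of a lowercase ASCII letter (a-z)."""
    return chr(ord(letter) & ~CASE_BIT)

print(ord("A"), format(ord("A"), "07b"), format(ord("A"), "X"))  # 65 1000001 41
print(to_lower("A"), format(ord("a"), "07b"))                    # a 1100001
print(to_upper("b"))                                             # B
```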

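Extended ASCII

Extended ASCII widens the character set from 7 bits to 8 bits, which doubles its capacity to 256 characters. The extra codes add more Latin letters, plus Greek letters, mathematical symbols, box-drawing characters, and similar, so the character set can serve more languages.

Extended ASCII was mainly used on early computer systems and has now been superseded by Unicode.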

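Unicode

Unicode is designed to support all of the world's writing systems. It is stored using one of several encoding schemes, UTF-8, UTF-16, and UTF-32, whose code units are 8, 16, and 32 bits wide respectively. UTF-8 and UTF-16 are variable-length (a character takes 1 to 4 bytes in UTF-8, and 2 or 4 bytes in UTF-16), while UTF-32 always uses 4 bytes per character.

Advantages of Unicode:

Supports the characters of languages worldwide (including Chinese characters, Arabic, Japanese, and so on).
Backward compatible with ASCII (the first 128 characters are identical to ASCII, and in UTF-8 characters such as digits are still stored in 1 byte).
Offers several encoding schemes (UTF-8, UTF-16, UTF-32).

For example: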

Character	UTF-8 binary representation
A	01000001
é	11000011 10101001
中	11100100 10111000 10101101

Character	UTF-16 binary representation
A	00000000 01000001
é	00000000 11101001
中	01001110 00101101
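
These bit patterns can be reproduced with Python's built-in `str.encode`. The helper name `encoded_bits` is an assumption for illustration; the `utf-16-be` codec is used so that no byte order mark is prepended, and the characters é (U+00E9) and 中 (U+4E2D) are written as escapes to avoid any ambiguity.

```python
def encoded_bits(text: str, encoding: str) -> str:
    """Return the encoded bytes of `text` as space-separated 8-bit groups."""
    return " ".join(format(byte, "08b") for byte in text.encode(encoding))

for character in ("A", "\u00e9", "\u4e2d"):
    print(character, encoded_bits(character, "utf-8"), "|", encoded_bits(character, "utf-16-be"))
```

The number of 8-bit groups printed for each character shows directly how many bytes that character needs in each encoding.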

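When every character in a text takes the same number of bits (as in ASCII, or in UTF-8 and UTF-16 text that stays within the ranges assumed below), the storage needed is:

$\text{storage (bits)} = \text{number of characters} \times \text{bits per character}$

For example, to store Hello:
ASCII (8-bit): $5 \times 8 = 40$ bits
UTF-8 (all characters in the ASCII range): $5 \times 8 = 40$ bits
UTF-16 (assuming basic characters): $5 \times 16 = 80$ bits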
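
The storage figures can be checked by encoding the text and counting bytes. The helper name `storage_bits` is an assumption for this sketch; `utf-16-be` is chosen so that the count excludes the optional byte order mark.

```python
def storage_bits(text: str, encoding: str) -> int:
    """Count the bits actually used when `text` is stored in `encoding`."""
    return len(text.encode(encoding)) * 8

print(storage_bits("Hello", "ascii"))      # 40
print(storage_bits("Hello", "utf-8"))      # 40
print(storage_bits("Hello", "utf-16-be"))  # 80
print(storage_bits("Data", "utf-16-be"))   # 64
```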

Example 15:
(a) Convert C to ASCII (denary).
(b) How many bits are required to store "Data" in UTF-16?

Solution:

(a) A is 65, and C is two letters after A, so C is $65 + 2 = 67$.

(b) 4 characters, so $4 \times 16 = 64$ bits

Applications of character encoding (concepts):

Explain how the Binary Coded Decimal (BCD) system is used in digital clocks and describe one advantage of using BCD in this context over using pure binary representation.

Solution:
The Binary Coded Decimal (BCD) system is utilised in digital clocks by representing each decimal digit of the time with its own group of four binary digits. A significant advantage of using BCD over pure binary representation is the ease of converting BCD to human-readable decimal form. In digital clocks, where time is displayed in a format easily understood by humans, BCD's structure makes the conversion straightforward and efficient, reducing computational complexity and enhancing the clock's performance in displaying time accurately.

Describe two advantages of using Unicode over ASCII for character data representation and explain why these advantages are significant in the context of global digital communication.

Solution:
Unicode offers two key advantages over ASCII in character data representation: comprehensive language support and consistency across different platforms. Firstly, unlike ASCII, which is limited to 128 or 256 characters, Unicode can represent over a million characters. This extensive range enables Unicode to support virtually every language and script in use around the world, including complex characters and symbols from diverse cultures. This global language support is crucial for facilitating international communication and software development, making digital content accessible and understandable worldwide. Secondly, Unicode ensures consistency in text representation across various devices and platforms. This uniformity is essential for maintaining the integrity and readability of digital texts, regardless of the system or application used, thus fostering seamless global communication and information exchange. These advantages make Unicode indispensable in our increasingly interconnected digital world, breaking down language barriers and enabling a truly global digital community.

Exercise 4

Convert 4 to ASCII (denary).
Convert ASCII 44 (hexadecimal) to character.
What is the ASCII value of the character b in decimal and binary?

Author: 张业浩 · Created: 2024-07-03 03:38
Last edited by: admin · Updated: 2025-03-17 12:13