[Translation] The most detailed guide to CSS character escaping
Published: 2019-06-21


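Original article

Original published by the author: 12th July 2010
Translator: 西楼听雨. Translation published: 2018/11/5 (please credit the source when reposting)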

When writing CSS for [markup with weird `class` or `id` attribute values](https://mathiasbynens.be/notes/html5-id-class), you need to [consider](https://www.w3.org/TR/CSS21/syndata.html#characters) some [rules](https://www.w3.org/International/questions/qa-escapes#cssescapes). For example, you can’t just use `## { color: #f00; }` to target the element with `id="#"`. Instead, you’ll have to escape the weird characters (in this case, the second `#`). Doing so cancels the meaning of special CSS characters in identifiers and allows you to refer to characters you cannot easily type out, like crazy Unicode symbols.

There are some other cases where you might want or need to escape a character in CSS. You could be writing a selector for a funky `id`, `class`, attribute or attribute value, for example; or maybe you want to insert some weird characters using the `content` property without changing your CSS file’s character encoding.

Identifiers and strings in CSS

[The spec](http://dev.w3.org/csswg/css-syntax/#ident-token-diagram) defines *identifiers* using a token diagram. They may contain the symbols from `a` to `z`, from `A` to `Z`, from `0` to `9`, underscores (`_`), hyphens (`-`), non-ASCII symbols or escape sequences for any symbol. They cannot start with a digit, or a hyphen (`-`) followed by a digit. Identifiers require at least one symbol (i.e. the empty string is not a valid identifier).

The grammar for identifiers is used for various things throughout the specification, including element names, class names, and IDs in selectors.

The spec definition for strings says that strings can be written either with double quotes or with single quotes. Double quotes cannot occur inside double quotes, unless escaped (e.g., as `'\"'` or as `'\22'`). The same goes for single quotes (e.g., `"\'"` or `"\27"`). A string cannot directly contain a newline. To include a newline in a string, use an escape sequence representing the line feed character (U+000A), such as `"\A"` or `"\00000a"`. Newlines can also be represented as `"\D \A "` (CRLF), `"\D "` (i.e. `\r` in other languages), or `"\C "` (i.e. `\f` in other languages). It’s possible to break strings over several lines, for aesthetic or other reasons, but in such a case the newline itself has to be escaped with a backslash (`\`).

As you can see, character escapes are allowed in both identifiers and strings. So, let’s find out how these escape sequences work!
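As a rough illustration of the identifier rules above, here is a minimal JavaScript sketch. The function name `followsIdentifierRules` is made up for this example, and it only checks the rules stated in this section (it does not understand escape sequences), so treat it as an approximation rather than a spec-complete validator.

```javascript
// Hypothetical helper: checks a raw (unescaped) value against the
// identifier rules listed above. Escape sequences are not handled.
function followsIdentifierRules(value) {
  // (?![0-9]|-[0-9])  must not start with a digit, or a hyphen plus a digit
  // [A-Za-z0-9_-]     letters, digits, underscores, hyphens
  // [^\x00-\x7F]      any non-ASCII symbol
  // +                 at least one symbol (the empty string is rejected)
  return /^(?![0-9]|-[0-9])(?:[A-Za-z0-9_-]|[^\x00-\x7F])+$/.test(value);
}

console.log(followsIdentifierRules('-a-b-c-')); // true
console.log(followsIdentifierRules('1a2b3c'));  // false (starts with a digit)
console.log(followsIdentifierRules('-1abc'));   // false (hyphen followed by a digit)
console.log(followsIdentifierRules(''));        // false (empty)
```

A value that fails this check can usually still be used in a selector, but it has to be escaped first.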

How to escape any character in CSS

Here’s a ~~simple~~ list of rules you should keep in mind when escaping a character in CSS. Keep in mind that if you’re writing a selector for a given classname or ID, the strict syntax for identifiers applies. If you’re using a (quoted) string in CSS, you’ll only ever need to escape quotes or newline characters.
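For the quoted-string case, a small JavaScript sketch might look like the following. The name `toCssString` is hypothetical, and the sketch additionally escapes literal backslashes, since a bare backslash would otherwise start an escape sequence.

```javascript
// Hypothetical helper: wraps text in double quotes for use as a CSS string.
function toCssString(text) {
  const body = text.replace(/["\\\n\r\f]/g, (ch) => {
    if (ch === '"' || ch === '\\') {
      return `\\${ch}`; // backslash-escape quotes and backslashes
    }
    // Newline-like characters become hex escapes; the trailing space ends the escape.
    return `\\${ch.codePointAt(0).toString(16).toUpperCase()} `;
  });
  return `"${body}"`;
}

console.log(toCssString('say "hi"'));  // "say \"hi\""
console.log(toCssString('one\ntwo'));  // "one\A two"
```

The result could be used, for example, as the value of the `content` property.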

Leading digits

If the first character of an identifier is numeric, you’ll need to escape it based on its Unicode code point. For example, the code point for the character 1 is U+0031, so you would escape it as \000031 or \31 (followed by a space).

Basically, to escape any numeric character, just prefix it with \3 and append a space character. Yay Unicode!
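The leading-digit rule can be sketched in a few lines of JavaScript. `escapeLeadingDigit` is an illustrative name, not part of any library, and the sketch deliberately ignores the related "hyphen followed by a digit" case.

```javascript
// Hypothetical helper: escapes a digit at the start of an identifier.
function escapeLeadingDigit(identifier) {
  // Digits 0-9 are U+0030 to U+0039, so the escape is "\3" plus the digit.
  // The trailing space ends the escape so a following hex digit is not absorbed.
  return identifier.replace(/^[0-9]/, (digit) => `\\3${digit} `);
}

console.log(escapeLeadingDigit('1a2b3c')); // \31 a2b3c
console.log(escapeLeadingDigit('abc'));    // abc
```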

Special characters

Any character that is not a hexadecimal digit (that is, 0 to 9 and a to f; translator's note), line feed, carriage return, or form feed can be escaped with a backslash to remove its special meaning.

The following characters have a special meaning in CSS: !, ", #, $, %, &, ', (, ), *, +, ,, -, ., /, :, ;, <, =, >, ?, @, [, \, ], ^, `, {, |, }, and ~.

There are two options if you want to use them. The first is to use the Unicode code point. For example, the plus sign (+) is U+002B, so to use it in a CSS selector you would escape it as \2b (note the space character at the end) or \00002b (using exactly six hexadecimal digits).

The second option is far more elegant though: just escape the character using a backslash (\), e.g. + would escape into \+.

Theoretically, the : character can be escaped as \:, but IE < 8 doesn’t recognize that escape sequence correctly. A workaround is to use \3A (followed by a space) instead.
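Here is a minimal JavaScript sketch of the backslash option, with the `\3A ` workaround applied to colons. The names `SPECIAL_CHARACTERS` and `escapeSpecial` are invented for illustration. The sketch escapes every character from the list above, including hyphens, which is valid but not always necessary.

```javascript
// The special characters listed above, as a character class.
const SPECIAL_CHARACTERS = /[!"#$%&'()*+,\-.\/:;<=>?@\[\\\]^`{|}~]/g;

// Hypothetical helper: backslash-escapes special characters in a value.
function escapeSpecial(value) {
  return value.replace(SPECIAL_CHARACTERS, (ch) =>
    // Colons use the hex escape (with trailing space) as the old-IE workaround.
    ch === ':' ? '\\3A ' : `\\${ch}`
  );
}

console.log(escapeSpecial(':`(')); // \3A \`\(
console.log(escapeSpecial('a+b')); // a\+b
```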

Whitespace

Whitespace, including some characters that are technically invalid in HTML attribute values, can be escaped as well.

Any characters matching [\t\n\v\f\r] need to be escaped based on their Unicode code points. The space character can simply be backslashed (a backslash followed by a space). Other whitespace characters don’t need to be escaped.
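The whitespace rules can be sketched as a single replacement pass in JavaScript. `escapeWhitespace` is a hypothetical name chosen for this example.

```javascript
// Hypothetical helper: escapes whitespace according to the rules above.
function escapeWhitespace(value) {
  return value.replace(/[\t\n\v\f\r ]/g, (ch) =>
    ch === ' '
      ? '\\ ' // a plain space is simply backslashed
      : `\\${ch.codePointAt(0).toString(16).toUpperCase()} ` // hex escape plus terminating space
  );
}

console.log(escapeWhitespace('a b'));  // a\ b
console.log(escapeWhitespace('a\tb')); // a\9 b
```

Both cases are handled in one pass so that the terminating space added after a hex escape is not itself escaped again.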

Underscores

CSS doesn’t require you to escape underscores (_), but if one appears at the start of an identifier, I’d recommend escaping it anyway to prevent IE6 from ignoring the rule altogether.

Other Unicode characters

Other than that, characters that can’t possibly convey any meaning in CSS (e.g. `♥`) can and **should** just be used unescaped.

In theory (as per the spec), any character can be escaped based on its Unicode code point as explained above (e.g. for 𝌆, the U+1D306 “tetragram for centre” symbol: `\1d306` or `\01d306`, each followed by a space), but older WebKit browsers don’t support this syntax for characters outside the BMP. (Translator's note: the BMP, or Basic Multilingual Plane, is one of the planes defined by Unicode and contains the most commonly used characters. Each plane holds 65,536, i.e. 2 to the 16th power, code points.)

Because of browser bugs, there is another (non-standard) way to escape these characters, namely by breaking them up into UTF-16 code units (e.g. `\d834\df06`), but this syntax (rightfully) isn’t supported in all browsers.

Since there is currently no way to escape non-BMP symbols in a cross-browser fashion without breaking backwards compatibility with older browsers, it’s best to just use these characters unescaped.
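To see where the two forms come from, the following JavaScript sketch derives both the code point escape and the non-standard UTF-16 code unit escape for U+1D306. The function names are made up for this example, and the output omits the terminating space that a short hex escape may need.

```javascript
// Hypothetical helper: one escape for the whole code point (the standard form).
function codePointEscape(symbol) {
  return `\\${symbol.codePointAt(0).toString(16)}`;
}

// Hypothetical helper: one escape per UTF-16 code unit (the non-standard form).
function utf16UnitEscapes(symbol) {
  let result = '';
  for (let i = 0; i < symbol.length; i++) {
    result += `\\${symbol.charCodeAt(i).toString(16)}`;
  }
  return result;
}

const tetragram = String.fromCodePoint(0x1d306); // a symbol outside the BMP

console.log(codePointEscape(tetragram));  // \1d306
console.log(utf16UnitEscapes(tetragram)); // \d834\df06
```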

Trailing whitespace after hexadecimal escape sequences

A U+0020 space character immediately following a hexadecimal escape sequence is automatically [consumed by the escape sequence](http://dev.w3.org/csswg/css-syntax/#consume-escaped-code-point). For example, to escape the text `foo © bar`, you would have to use `foo \A9  bar`, with two space characters following `\A9`. The first space character gets swallowed; only the second one is preserved.

The space character following a hexadecimal escape sequence can only be omitted if the next character is not another space character and not a hexadecimal digit. For example, `foo©bar` becomes `foo\A9 bar`, but `foo©qux` could be written as `foo\A9qux`.
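The rule for when the terminating space is needed can be sketched in JavaScript as follows. `escapeNonAscii` is a hypothetical helper that escapes every non-ASCII character. As an assumption based on the CSS Syntax spec, it treats tabs and newlines after an escape the same way as spaces.

```javascript
// Hypothetical helper: hex-escapes non-ASCII characters, adding the
// terminating space only where it is required.
function escapeNonAscii(text) {
  const chars = Array.from(text); // iterate by code point, not code unit
  return chars
    .map((ch, index) => {
      if (ch.codePointAt(0) < 0x80) {
        return ch; // ASCII is left untouched in this sketch
      }
      const hex = ch.codePointAt(0).toString(16).toUpperCase();
      const next = chars[index + 1];
      // The space is required if the next character is whitespace
      // (it would be swallowed) or a hex digit (it would join the escape).
      const needsSpace = next !== undefined && /^[ \t\n\r\f0-9a-fA-F]$/.test(next);
      return `\\${hex}${needsSpace ? ' ' : ''}`;
    })
    .join('');
}

console.log(escapeNonAscii('foo © bar')); // foo \A9  bar
console.log(escapeNonAscii('foo©bar'));   // foo\A9 bar
console.log(escapeNonAscii('foo©qux'));   // foo\A9qux
```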

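Examples

Here are a few arbitrary examples:

.\3A \`\( { } /* matches elements with class=":`(" */
.\31 a2b3c { } /* matches elements with class="1a2b3c" */
#\#fake-id {} /* matches the element with id="#fake-id" */
#-a-b-c- {} /* matches the element with id="-a-b-c-" */
#© { } /* matches the element with id="©" */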

For more examples, see the material accompanying the original post.

… and what about JavaScript?

In JavaScript, it depends.

document.getElementById() and similar functions like document.getElementsByClassName() can just use the unescaped attribute value, the way it’s used in the HTML. Of course, you would have to escape any quotes so that you still end up with a valid JavaScript string.

On the other hand, if you were to use these selectors with document.querySelector() and document.querySelectorAll(), or with libraries that rely on the same syntax (e.g. jQuery/Sizzle), you would have to take the escaped CSS selectors and escape them again. All you really have to do is double every backslash in the CSS selector (and of course escape the quotes, where necessary):

/* CSS */
.\3A \`\( { }

/* JavaScript */
document.getElementsByClassName(':`(');
document.querySelectorAll('.\\3A \\`\\(');
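The "double every backslash" step can itself be sketched in JavaScript. `toJsStringLiteral` is an illustrative name. It takes an already escaped CSS selector and produces the source text of a single-quoted JavaScript string literal for it.

```javascript
// Hypothetical helper: turns an escaped CSS selector into the source text
// of a single-quoted JavaScript string literal.
function toJsStringLiteral(cssSelector) {
  const body = cssSelector
    .replace(/\\/g, '\\\\') // double every backslash
    .replace(/'/g, "\\'");  // escape the quote character used as delimiter
  return `'${body}'`;
}

// The escaped CSS selector  .\3A \`\(  written as a JavaScript string:
const cssSelector = '.\\3A \\`\\(';

console.log(cssSelector);                    // .\3A \`\(
console.log(toJsStringLiteral(cssSelector)); // '.\\3A \\`\\('
```

The second line of output is the argument as it would be typed into a `document.querySelectorAll()` call.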

CSS escapes tool

Remembering all these rules sure sounds like fun, but to make life a little easier I created a simple CSS escaper tool that does all the hard work for you.

Just enter a value and it will tell you how to escape it in CSS and JavaScript, based on the rules above. It uses an `id` attribute in its example, but of course you could use the same escaped string for `class` attribute values or the `content` property. Enjoy!

Need to escape text for use in CSS strings or identifiers? I’ve packaged the code that powers this tool as an open-source JavaScript library. Check it out!

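Update: the CSS Object Model specification now defines a `CSS.escape()` method that can be used to perform this escaping. I have written a polyfill for it.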
