Reference: https://docs.microsoft.com/en-us/aspnet/core/security/cross-site-scripting
Customizing the Encoders
By default, encoders use a safe list limited to the Basic Latin Unicode range and encode all characters outside of that range as their character code equivalents. This behavior also affects Razor TagHelper and HtmlHelper rendering, because both use the encoders to output your strings.
The reasoning behind this is to protect against unknown or future browser bugs (previous browser bugs have tripped up parsing based on the processing of non-English characters). If your web site makes heavy use of non-Latin characters, such as Chinese, Cyrillic or others, this is probably not the behavior you want.
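The default behavior can be observed outside of Razor with a small console sketch. It assumes a project that can reference System.Text.Encodings.Web; the class name and the sample string are illustrative only.

```csharp
using System;
using System.Text.Encodings.Web;

class DefaultEncoderDemo
{
    static void Main()
    {
        // HtmlEncoder.Default uses the Basic Latin-only safe list.
        string input = "abc汉语";
        string encoded = HtmlEncoder.Default.Encode(input);

        // The Basic Latin letters pass through unchanged, while the two
        // Chinese characters are replaced by numeric character references.
        Console.WriteLine(encoded);
        Console.WriteLine(encoded.StartsWith("abc")); // True
        Console.WriteLine(encoded.Contains("汉"));     // False
    }
}
```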
You can customize the encoder safe lists to include Unicode ranges appropriate to your application during startup, in ConfigureServices().
For example, using the default configuration you might use a Razor HtmlHelper like so:
<p>This link text is in Chinese: @Html.ActionLink("汉语/漢語", "Index")</p>
When you view the source of the web page you will see it has been rendered as follows, with the Chinese text encoded:
<p>This link text is in Chinese: <a href="/">&#x6C49;&#x8BED;/&#x6F22;&#x8A9E;</a></p>
To widen the characters treated as safe by the encoder, you would insert the following line into the ConfigureServices() method in startup.cs:
// Requires: using System.Text.Encodings.Web; using System.Text.Unicode;
services.AddSingleton<HtmlEncoder>(
    HtmlEncoder.Create(allowedRanges: new[] { UnicodeRanges.BasicLatin,
                                              UnicodeRanges.CjkUnifiedIdeographs }));
This example widens the safe list to include the Unicode range CjkUnifiedIdeographs. The rendered output would now become:
<p>This link text is in Chinese: <a href="/">汉语/漢語</a></p>
Safe list ranges are specified as Unicode code charts, not languages. The Unicode standard has a list of code charts you can use to find the chart containing your characters. Each encoder (HTML, JavaScript and URL) must be configured separately.
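Because each encoder has its own safe list, widening all three means registering three instances. The following is a minimal sketch of one way to group those registrations. The extension method name AddCjkEncoders is hypothetical, and the choice to give all three encoders the same ranges is an assumption made for illustration.

```csharp
using System.Text.Encodings.Web;
using System.Text.Unicode;
using Microsoft.Extensions.DependencyInjection;

public static class EncoderRegistration
{
    // Hypothetical helper; call it from ConfigureServices() as
    // services.AddCjkEncoders();
    public static IServiceCollection AddCjkEncoders(this IServiceCollection services)
    {
        // Ranges are Unicode code charts, so a language may need several.
        var ranges = new[]
        {
            UnicodeRanges.BasicLatin,
            UnicodeRanges.CjkUnifiedIdeographs
        };

        // One registration per encoder type; none of them inherits
        // the safe list of another.
        services.AddSingleton<HtmlEncoder>(HtmlEncoder.Create(ranges));
        services.AddSingleton<JavaScriptEncoder>(JavaScriptEncoder.Create(ranges));
        services.AddSingleton<UrlEncoder>(UrlEncoder.Create(ranges));
        return services;
    }
}
```

Whether the JavaScript and URL encoders should really share the HTML encoder's ranges depends on where the encoded output ends up, so treat the shared array as a starting point rather than a recommendation.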
Note
Customization of the safe list only affects encoders sourced via DI. If you directly access an encoder via System.Text.Encodings.Web.*Encoder.Default then the default, Basic Latin-only safe list will be used.
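The difference between the two sources can be checked with a short console sketch. It assumes the Microsoft.Extensions.DependencyInjection package is referenced and builds a stand-alone container instead of a full ASP.NET Core host; the class name is illustrative.

```csharp
using System;
using System.Text.Encodings.Web;
using System.Text.Unicode;
using Microsoft.Extensions.DependencyInjection;

class EncoderSourceDemo
{
    static void Main()
    {
        // Register a widened HtmlEncoder, as ConfigureServices() would.
        var services = new ServiceCollection();
        services.AddSingleton<HtmlEncoder>(
            HtmlEncoder.Create(UnicodeRanges.BasicLatin,
                               UnicodeRanges.CjkUnifiedIdeographs));

        var provider = services.BuildServiceProvider();
        HtmlEncoder fromDi = provider.GetRequiredService<HtmlEncoder>();

        string text = "汉语";

        // The DI-sourced encoder honors the customized safe list.
        Console.WriteLine(fromDi.Encode(text) == text);              // True

        // The static Default instance ignores the registration.
        Console.WriteLine(HtmlEncoder.Default.Encode(text) == text); // False
    }
}
```

In application code this means taking HtmlEncoder as a constructor parameter wherever the widened safe list is expected to apply.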