
User:SchlurcherBot


SchlurcherBot

Function overview: Convert links from http:// to https://

Rationale:

Programming language: C#

Source code available: Main C# script: commons:User:SchlurcherBot/LinkChecker

Namespaces: This bot only edits in namespaces 0 (Main) and 6 (File).

Function details: The link checking algorithm is as follows:

  1. The bot extracts all http-links from the parsed HTML of a page.
    • It searches for all href attributes and extracts the links.
    • It does not search the wikitext, and thus does not rely on any regex.
    • This also avoids problems with templates that modify links (like archiving templates).
    • Links that are subsets of other links are filtered out to minimize search-and-replace errors.
  2. The bot checks whether the identified http-links also occur in the wikitext; if not, they are skipped.
  3. The bot checks whether both the http-link and the corresponding https-link are accessible.
    • This step also uses a blacklist of domains that were previously identified as not accessible.
  4. If both links redirect to the same page, the http-link is replaced by the https-link (the link is not changed to the redirect target; the original link path is kept).
  5. If both links are accessible and return a success code (2xx), the bot checks whether the content is identical.
    1. If the content is identical and the link points directly to the host, the http-link is replaced by the https-link.
    2. If the content is identical but the link does not point directly to the host, the bot checks whether the content is identical to that of the host link. Only if it differs is the http-link replaced by the https-link.
      • This step was added because some hosts return the same content for all their pages (like most domain sellers, some news sites, or pages under ongoing maintenance).
    3. If the content is not identical, the bot checks whether it is at least 99.9% identical (calculated via the en:Levenshtein distance).
      • This step was added because most homepages use dynamic IDs for certain elements, such as ad containers, to circumvent ad blockers.
    4. If the content is at least 99.9% identical, the same host check as before is performed.
    5. If any of the checked links fails (for example with code 404), nothing is changed.
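
Steps 1 and 2 of the list above can be illustrated with a small Python sketch. This is not the bot's C# code; the names HrefCollector and candidate_links are hypothetical, and the reading of "subset" as "substring of another extracted link" is an assumption.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects every href attribute value that starts with http://."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "href" and value and value.startswith("http://"):
                self.links.append(value)


def candidate_links(parsed_html, wikitext):
    """Return http-links found in the rendered HTML that also occur in the wikitext."""
    collector = HrefCollector()
    collector.feed(parsed_html)
    found = set(collector.links)
    # Assumption: a link counts as a "subset" if it is a substring of another
    # extracted link; replacing it could silently alter the longer link too.
    standalone = [link for link in found
                  if not any(link != other and link in other for other in found)]
    # Step 2: skip links that only exist in the rendered HTML (e.g. added by a template).
    return sorted(link for link in standalone if link in wikitext)
```

Because the links come from the rendered HTML, a link produced by a template never reaches the replacement step unless the same string is literally present in the wikitext.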
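
The decision rules in steps 4 and 5 can be sketched in Python as follows. This is an illustration only, not the bot's C# code. The Fetch record, the function names, and the way the Levenshtein distance is turned into a percentage (one minus the distance divided by the length of the longer text) are assumptions; the description only states the 99.9% threshold.

```python
from dataclasses import dataclass


@dataclass
class Fetch:
    """Hypothetical result of requesting one URL with redirects followed."""
    url: str        # URL that was requested
    final_url: str  # URL after redirects (equal to url if none occurred)
    status: int     # final HTTP status code
    body: str       # response content


def levenshtein(a, b):
    """Dynamic-programming edit distance, keeping one row at a time."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(previous[j] + 1,                        # deletion
                               current[j - 1] + 1,                     # insertion
                               previous[j - 1] + (char_a != char_b)))  # substitution
        previous = current
    return previous[-1]


def similarity(a, b):
    """Assumed normalization: 1 - distance / length of the longer text."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def should_replace(http, https, host_body, is_host_link, threshold=0.999):
    """Return True if the http-link may be rewritten to https."""
    # Step 5.5: any failed request means no change.
    if not (200 <= http.status < 300 and 200 <= https.status < 300):
        return False
    # Step 4: both links were redirected and ended on the same page.
    both_redirected = http.final_url != http.url and https.final_url != https.url
    if both_redirected and http.final_url == https.final_url:
        return True
    # Steps 5.1 and 5.3: identical content scores 1.0, so one test covers both.
    if similarity(http.body, https.body) < threshold:
        return False
    # Steps 5.2 and 5.4: a deep link must differ from the host's own page,
    # otherwise the host may be serving the same content for every path.
    return True if is_host_link else https.body != host_body
```

The edit distance above takes time proportional to the product of the two text lengths, so a real checker would probably compare for exact equality first and only fall back to the distance when that fails.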

Source for pages: The bot works on the list of pages identified through the external links SQL dump. The list was scrambled to ensure that subsequent edits are not clustered from a specific area.
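
Scrambling the page list could look like the following Python sketch. The function name scramble and the optional seed are hypothetical; how the bot actually randomizes its list is not stated here.

```python
import random


def scramble(page_titles, seed=None):
    """Return a shuffled copy so that consecutive edits are spread across topics."""
    shuffled = list(page_titles)           # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)  # a fixed seed makes the order reproducible
    return shuffled
```

A fixed seed is useful if the run has to be resumed later with the same page order.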

Further comments: The bot respects the API etiquette: it sends a user-agent header and respects the maxlag parameter.
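
A minimal Python sketch of such a request is shown below. It is not the bot's C# code: the user-agent string, the helper names, and the maxlag value of 5 seconds are assumptions chosen for illustration. When a wiki is lagged beyond the given value, the MediaWiki API answers with an error whose code is "maxlag", and the client is expected to wait before retrying.

```python
import urllib.parse
import urllib.request

API_URL = "https://commons.wikimedia.org/w/api.php"
# Hypothetical identification string; a real bot would name itself and a contact.
USER_AGENT = "ExampleLinkBot/0.1 (https://example.org/contact)"


def build_request(params, maxlag=5):
    """Build an API request that carries a user-agent header and a maxlag value."""
    query = dict(params, format="json", maxlag=str(maxlag))
    url = API_URL + "?" + urllib.parse.urlencode(query)
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def is_maxlag_error(response):
    """True if a decoded JSON response reports that the servers are lagged."""
    return response.get("error", {}).get("code") == "maxlag"


def retry_delay(retry_after_header, default=5):
    """Seconds to wait before retrying, taken from the Retry-After header if usable."""
    try:
        return max(int(retry_after_header), 0)
    except (TypeError, ValueError):
        return default
```

The request object can be passed to urllib.request.urlopen; on a maxlag error the caller would sleep for retry_delay(...) seconds and send the same request again.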

Status: (CentralAuth)

Approved as global bot (per this request) and thus flagged as bot on all projects that did not opt-out (per this list).

Project Request Pages Edit Description Used Status
commonswiki Approved 31'145'089 Fix http to https Working Waiting
dewiki Approved 1'888'381 Bot: http → https Working Waiting
enwiki Approved 8'570'327 Bot: http → https Working Waiting
eswiki Approved 2'191'542 Bot: http → https Working Waiting
frwiki Approved 2'970'187 Bot: http → https Working Waiting
itwiki Approved 2'359'233 Bot: http → https Working Waiting
jawiki Allows global bots 994'375 Bot: http → https Running…
plwiki Approved 1'527'763 Bot: http → https Running…
ptwiki Approved 1'214'889 Bot: http → https Working Waiting
ruwiki Allows global bots 1'797'992 Bot: http → https Working Waiting
zhwiki Allows global bots 1'105'051 Bot: http → https Working Waiting
dewikinews Approved 17'280 Bot: http → https Done
dewikiquote Pending 5'673 Bot: http → https  On hold
dewikisource Approved 97'284 Bot: http → https Done
dewikiversity Approved 9'301 Bot: http → https Done
dewikivoyage Approved 19'094 Bot: http → https Done
dewiktionary Approved 145'334 Bot: http → https Done
altwiki Pending (Village Pump) 864 Bot: http → https  On hold
arywiki [1] 8'953 Bot: http → https
bnwiki [2] 172'842 Bot: http → https
bnwikibooks Pending (Village Pump) 745 Bot: http → https  On hold
bswiki [3] 79'281 Bot: http → https
cswikibooks [4] 601 Bot: http → https
cswikisource [5] 46'231 Bot: http → https
cswikiversity [6] 1'946 Bot: http → https
dvwiki [7] 1'088 Bot: http → https
enwikisource [8] 117'079 Bot: http → https
enwiktionary [9] 443'242 Bot: http → https
eswikibooks [10] 3'417 Bot: http → https
eswikinews [11] 14'339 Bot: http → https
eswikisource Pending (Village Pump) 5'826 Bot: http → https  On hold
frwikibooks [12] 7'632 Bot: http → https
frwikinews [13] 19'721 Bot: http → https
frwikisource Pending (Village Pump) 42'309 Bot: http → https  On hold
frwikiversity [14] 4'126 Bot: http → https
frwikivoyage [15] 8'536 Bot: http → https
frwiktionary [16] 532'493 Bot: http → https
fywiki Pending (Village Pump) 30'516 Bot: http → https  On hold
glwiki [17] 213'696 Bot: http → https
hewikibooks Pending (Village Pump) 1'660 Bot: http → https  On hold
hewikisource Pending 98'820 Bot: http → https  On hold
hewikivoyage Allows global bots 2'038 Bot: http → https
hewiktionary Pending (Village Pump) 6'559 Bot: http → https  On hold
hiwiktionary [18] (Village Pump) 4'970 Bot: http → https
hrwikibooks [19] (Village Pump) 428 Bot: http → https
hrwikiquote [20] (Village Pump) 1'254 Bot: http → https
huwikibooks [21] (Village Pump) 18'488 Bot: http → https
huwikisource [22] (Village Pump) 7'222 Bot: http → https
idwiki [23] 673'383 Bot: http → https
iswiki [24] (Village Pump) 30'026 Bot: http → https
iswikisource [25] (Village Pump) 38 Bot: http → https
iswiktionary [26] (Village Pump) 17'145 Bot: http → https
itwikinews [27] 12'880 Bot: http → https
itwikivoyage [28] 8'563 Bot: http → https
itwiktionary [29] 80'610 Bot: http → https
jawikibooks [30] (Village Pump) 1'873 Bot: http → https
jawiktionary [31] (Village Pump) 8'834 Bot: http → https
kshwiki [32] (Village Pump) 1'364 Bot: http → https
lawikisource [33] (Village Pump) 9'453 Bot: http → https
liwikisource [34] (Village Pump) 1'080 Bot: http → https
liwiktionary [35] 86 Bot: http → https
mnwwiki [36] 1'010 Bot: http → https
mrwiki [37] 59'852 Bot: http → https
mrwikisource [38] (Village Pump) 1'372 Bot: http → https
mtwiki [39] 4'626 Bot: http → https
ndswiki [40] 24'342 Bot: http → https
nlwikivoyage [41] 2'385 Bot: http → https
nnwiki [42] (Village Pump) 144'877 Bot: http → https
outreachwiki [43] 6'136 Bot: http → https
plwikiquote [44] (Village Pump) 10'443 Bot: http → https
plwiktionary [45] (Village Pump) 92'252 Bot: http → https
ptwikibooks [46] (Village Pump) 4'917 Bot: http → https
rowiki [47] 608'015 Bot: http → https
rowiktionary [48] (Village Pump) 82'646 Bot: http → https
ruwikinews [49] (Village Pump) 833'044 Bot: http → https
ruwikisource [50] 213'855 Bot: http → https
ruwiktionary [51] 19'274 Bot: http → https
slwiki [52] 135'056 Bot: http → https
slwikisource [53] 16'902 Bot: http → https
sourceswiki [54] (Village Pump) 26'478 Bot: http → https
specieswiki [55] (Village Pump) 640'405 Bot: http → https
srwiki [56] 659'443 Bot: http → https
srwikibooks [57] (Village Pump) 935 Bot: http → https
svwikisource [58] 1'791 Bot: http → https
svwikiversity [59] (Village Pump) 374 Bot: http → https
svwikivoyage [60] 1'327 Bot: http → https
ukwiki [61] 1'280'019 Bot: http → https
urwiki [62] 148'606 Bot: http → https
vecwikisource [63] (Village Pump) 4'875 Bot: http → https
viwiki [64] 1'350'516 Bot: http → https
wuuwiki [65] 6'016 Bot: http → https
yuewiktionary [66] (Village Pump) 401 Bot: http → https