WP-AutoPostPro: how to automatically collect and publish content in WordPress
WP-AutoPostPro is arguably the best WordPress automatic collection and publishing plugin currently available. Its biggest feature is that it can collect content from any website and automatically publish it to your WordPress site.
The collection process is completely automatic and does not require human intervention. The plugin also provides content filtering, HTML tag filtering, keyword replacement, automatic linking, automatic tagging, automatic downloading of remote images to the local server, and automatic addition of article prefixes and suffixes. It can also use the Microsoft translation engine to translate collected articles into various languages before publication.
Free download of the Chinese version of WP-AutoPost: https://www.lanzouw.com/iLaIOxcfdzi
1. Install WP-AutoPost
Install it like any other WordPress plugin: upload it to the plugin directory and activate it. No additional settings or code modifications are needed.
2. Create a collection task
Click "New Task" and enter a task name to create a new task. The task then appears in the task list, where you can make further settings for it.
3. Basic settings function
Under the Basic Settings tab, you can set the following:
Task name: You can modify the task name
Category: The category under which the articles collected by this task are published
Author: The author assigned to the articles collected by this task; it must be a registered user in WordPress
Update interval: how often to check if there are new articles to update under this collection task
Character set: The character set encoding of the target website; the default is UTF8. If the target web page uses a different encoding, the crawled page will be garbled. Setting the correct character set solves this problem.
Download remote images: If the articles collected under this task contain images, you can choose whether to download the remote images to the local server. If you choose to download remote images, you can further choose whether to save the downloaded image information to the WordPress media library.
Automatic tagging: Choose whether to use automatic tagging
Tag list: When automatic tagging is enabled, a tag is added automatically if the article contains the corresponding keyword from this list
Match complete words: Keywords are matched only as whole words. This setting is meant for English articles; do not enable it for Chinese articles.
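The automatic tagging and "match complete words" settings can be pictured with a small Python sketch. This is only an illustration of the idea, not the plugin's own code; the function name `auto_tags` and the sample data are hypothetical.

```python
import re

def auto_tags(text, tag_list, whole_words=False):
    """Return the tags from tag_list whose keyword occurs in text."""
    matched = []
    for tag in tag_list:
        if whole_words:
            # \b requires a word boundary on both sides of the keyword,
            # so "art" no longer matches inside "article" or "start".
            pattern = r"\b" + re.escape(tag) + r"\b"
        else:
            pattern = re.escape(tag)
        if re.search(pattern, text):
            matched.append(tag)
    return matched

# Hypothetical article text and tag list.
article = "This article explains how to start a WordPress site."
tags = ["art", "WordPress", "plugin"]

loose_tags = auto_tags(article, tags)                     # substring matching
strict_tags = auto_tags(article, tags, whole_words=True)  # whole words only
```

Chinese text is written without spaces between words, so a keyword in the middle of a sentence has no word boundary around it. Whole-word matching would then miss it, which is presumably why the setting should stay off for Chinese articles.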
4. Article source settings
Under this tab, we need to set the article list URL of the article source and the matching rules for specific articles.
Let's take "Sina Internet News" as an example. Its article list URL is http://roll.tech.sina.com.cn/internet_worldlist/index.shtml, so just enter this URL in the manual article list URL field.
After that, you need to set the matching rules for the specific article URLs under the article list URL.
5. Article URL Matching Rules
The article URL matching rules are simple to set up. Two matching modes are provided: URL wildcard matching and CSS selector matching. URL wildcard matching is usually the simpler of the two.
1. Use URL wildcard matching
Opening the list URL http://roll.tech.sina.com.cn/internet_worldlist/index.shtml shows that the URL of each article has the following structure:
http://tech.sina.com.cn/i/2013-06-27/16328485884.shtml
Therefore, just replace the changing numbers or letters in the URL with wildcards (*), such as: http://tech.sina.com.cn/i/(*)/(*).shtml
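As a rough illustration of how such a wildcard rule can be applied, the Python sketch below converts the rule into a regular expression and filters a list of links. How the plugin itself interprets (*) is not documented here. The helper name `wildcard_to_regex` is hypothetical, and so is the assumption that a wildcard matches any non-empty run of characters.

```python
import re

def wildcard_to_regex(rule):
    """Compile a URL rule in which (*) stands for any run of characters."""
    # Escape the literal parts so that "." is matched verbatim,
    # then join them with a non-greedy "anything" group.
    literal_parts = [re.escape(part) for part in rule.split("(*)")]
    return re.compile("^" + "(.+?)".join(literal_parts) + "$")

rule = "http://tech.sina.com.cn/i/(*)/(*).shtml"
pattern = wildcard_to_regex(rule)

# Links as they might be found on the list page.
links = [
    "http://tech.sina.com.cn/i/2013-06-27/16328485884.shtml",
    "http://roll.tech.sina.com.cn/internet_worldlist/index.shtml",
]
article_urls = [url for url in links if pattern.match(url)]
```

Only the first link fits the rule; the list page itself is rejected because its host and path differ from the literal parts of the pattern.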
2. Use CSS selectors for matching
To match with CSS selectors, we only need to set the CSS selector of the article URL. View the source code of the list URL http://roll.tech.sina.com.cn/internet_worldlist/index.shtml and find the code for the hyperlinks to the specific articles; the selector can then be set easily.
If you are not sure whether the settings are correct, click the test button. If they are correct, all the article titles and corresponding web page addresses under the list URL are listed.
6. Article crawling settings
Under this tab, we need to set the matching rules for article titles and article contents. Two methods are provided; the CSS selector method is recommended because it is simpler and more accurate.
We only need to set the article title CSS selector and the article content CSS selector to accurately capture the article title and article content.
We continue with the "Sina Internet News" example from the article source settings. The selectors are easy to set by viewing the source code of a specific article under the list URL http://roll.tech.sina.com.cn/internet_worldlist/index.shtml, for example http://tech.sina.com.cn/n/i/2013-06-10/06308430630.shtml.
In that source code, the article title is inside the tag with the id "artibodyTitle", so the article title CSS selector only needs to be set to #artibodyTitle.
Similarly, the article content is inside the tag with the id "artibody", so the article content CSS selector only needs to be set to #artibody.
If you are not sure whether the settings are correct, click the test button and enter a test address. If the settings are correct, the article title and article content are displayed, which makes the settings easy to check.
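To see what the two id selectors pick out, here is a minimal Python sketch using only the standard library. The sample HTML is made up for illustration (the real Sina page source is not reproduced here). The class `IdTextExtractor` and the helper `select_by_id` are hypothetical names, and the sketch returns plain text rather than the HTML a plugin would more likely keep. It also assumes reasonably well-formed markup.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change the depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}

class IdTextExtractor(HTMLParser):
    """Collect the text inside the element whose id equals target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0    # nesting depth inside the target element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth > 0 and tag not in VOID_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)

def select_by_id(html, target_id):
    """Rough equivalent of the CSS selector '#target_id', returning text only."""
    parser = IdTextExtractor(target_id)
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())

# Made-up page that reuses the two ids from the example above.
page = """
<html><body>
<h1 id="artibodyTitle">Example headline</h1>
<div id="artibody"><p>First paragraph.</p><p>Second paragraph.</p></div>
<div class="footer">Unrelated footer text</div>
</body></html>
"""
title = select_by_id(page, "artibodyTitle")
content = select_by_id(page, "artibody")
```

Here `title` holds the headline text and `content` holds the two paragraphs, while the footer is ignored because it lies outside both selected elements.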
1. Crawl paginated article content
If an article is long and split across multiple pages, all of its content can still be crawled. In this case, you need to set the article pagination link CSS selector. By viewing the source code of the specific article URL, you can find the location of the paging links.
For example, if the paging link a tags of an article are inside the tag with class "page-link", the article pagination link CSS selector is set to .page-link a.
If you check the box for pagination when publishing, published articles will also be paginated. If your WordPress theme does not support the pagination tag, do not check it.
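The selector .page-link a means "every a tag nested inside an element with class page-link". A minimal standard-library Python sketch of that lookup follows. The sample HTML, its URLs, and the class name `PageLinkCollector` are hypothetical, and well-formed markup is assumed.

```python
from html.parser import HTMLParser

# Elements without a closing tag; they must not affect the nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}

class PageLinkCollector(HTMLParser):
    """Collect href values of <a> tags inside an element with a given class."""

    def __init__(self, container_class):
        super().__init__()
        self.container_class = container_class
        self.depth = 0    # nesting depth inside the container element
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        if self.depth > 0:
            self.depth += 1
            if tag == "a" and attrs.get("href"):
                self.hrefs.append(attrs["href"])
        elif self.container_class in (attrs.get("class") or "").split():
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth > 0 and tag not in VOID_TAGS:
            self.depth -= 1

# Made-up article markup with a pagination block.
article_html = """
<div id="content"><p>Page one text.</p></div>
<div class="page-link">
  <a href="/post/123?page=2">2</a>
  <a href="/post/123?page=3">3</a>
</div>
<a href="/about">About</a>
"""
collector = PageLinkCollector("page-link")
collector.feed(article_html)
page_urls = collector.hrefs
```

Only the two links inside the pagination block are collected; the "About" link is skipped because it sits outside the page-link element. A crawler would then fetch each collected URL and append its content to the article.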
2. Article content filtering function
The article content filtering function can filter out content that you do not want to publish in the text (such as advertising codes, copyright information, etc.). Two keywords can be set to delete the content between the two keywords. Keyword 2 can be empty, which means deleting all content after keyword 1.
For example, if a test crawl shows that the article contains content we do not want to publish, switch to the HTML view, find the HTML code of that content, and set the two keywords so that it is filtered out.
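The two-keyword rule can be sketched in a few lines of Python. This is an illustration under assumptions, not the plugin's code: the function name `filter_between` is hypothetical. It assumes that the keywords themselves are removed along with the content between them, and that the text is left unchanged when a keyword is not found.

```python
def filter_between(text, keyword1, keyword2=""):
    """Delete the content from keyword1 up to and including keyword2.

    An empty keyword2 deletes everything from keyword1 to the end.
    """
    start = text.find(keyword1)
    if start == -1:
        return text  # keyword 1 not present: nothing to filter
    if not keyword2:
        return text[:start]
    end = text.find(keyword2, start + len(keyword1))
    if end == -1:
        return text  # keyword 2 not found after keyword 1: leave text as is
    return text[:start] + text[end + len(keyword2):]

# Made-up article HTML with an advertising block in the middle.
body = '<p>Main text.</p><div class="ad">Buy now!</div><p>More text.</p>'
cleaned = filter_between(body, '<div class="ad">', "</div>")
truncated = filter_between(body, '<div class="ad">')
```

With both keywords set, only the advertising block disappears. With keyword 2 left empty, everything from the block onward is dropped.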
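3. HTML tag filtering function
The HTML tag filtering function can filter out hyperlinks (a tags) in collected articles, as well as unnecessary code inside other tags.
To filter out the hyperlinks in an article, just enter a and select No for "Delete tag content".
To filter out unnecessary code contained in other tags, just enter the corresponding tag name and select Yes for "Delete tag content".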
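The difference between the two "Delete tag content" choices can be shown with a short Python sketch. It uses regular expressions for brevity, which is fragile on nested or malformed HTML. The function name `filter_tag` is hypothetical, and the `script` tag in the sample is only an illustrative choice, not a tag named in the original text.

```python
import re

def filter_tag(html, tag, delete_content=False):
    """Strip a tag from html, either keeping or deleting what it encloses."""
    name = re.escape(tag)
    if delete_content:
        # Remove the whole element: opening tag, content, and closing tag.
        pattern = rf"<{name}\b[^>]*>.*?</{name}\s*>"
        return re.sub(pattern, "", html, flags=re.IGNORECASE | re.DOTALL)
    # Remove only the opening and closing tags; the enclosed text stays.
    pattern = rf"</?{name}\b[^>]*>"
    return re.sub(pattern, "", html, flags=re.IGNORECASE)

# Made-up article HTML.
html = '<p>Read <a href="http://example.com">this page</a>.</p><script>track();</script>'

no_links = filter_tag(html, "a")                             # "Delete tag content": No
no_script = filter_tag(html, "script", delete_content=True)  # "Delete tag content": Yes
```

Selecting No keeps the link text "this page" and drops only the a tags around it, whereas selecting Yes removes the tag together with everything inside it.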