Documentation
Getting Started
Introduction
Our API was developed to extract news from almost any source.
To get started, sign up and pass your API token to any of the API endpoints documented below.
If you have any questions or concerns, feel free to contact us.
Authentication
After you sign up for free, you will find your API token on your dashboard. Add it to any of our API endpoints as the api_token GET parameter to gain access. Examples of how this is done can be found below.
API Endpoints
Article Extraction
Endpoint
GET https://api.articlextractor.com/v1/extract HTTP/1.1
If you have issues with your requests, please ensure your GET parameters are URL-encoded.
All text data returned is UTF-8.
All dates are in UTC (GMT).
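As a small illustration of the notes above, the sketch below decodes a response body as UTF-8 and parses a publish_date value into a timezone-aware datetime. The body used here is a trimmed, hand-written stand-in rather than real API output, the helper name parse_publish_date is hypothetical, and the date format is assumed to match the one shown in the example response in this documentation.

```python
import json
from datetime import datetime, timezone


def parse_publish_date(value: str) -> datetime:
    """Parse a publish_date string such as '2022-09-12 14:00:05+00:00'."""
    parsed = datetime.fromisoformat(value)
    # Assumption: a value without an offset is treated as UTC,
    # because the API states that all dates are in UTC.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


# Hand-written stand-in for the raw response bytes (not real API output).
raw_body = '{"data": {"publish_date": "2022-09-12 14:00:05+00:00"}}'.encode("utf-8")

# All text returned by the API is UTF-8, so decode explicitly before parsing.
payload = json.loads(raw_body.decode("utf-8"))
published = parse_publish_date(payload["data"]["publish_date"])
print(published.isoformat())
```

Keeping the datetime timezone-aware avoids accidentally comparing it with local, naive timestamps later on.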
HTTP GET Parameters
name | required | description |
---|---|---|
api_token | true | Your API token, which can be found on your account dashboard. |
url | true | URL of the article you want to extract. |
include_html | false | Optionally include the full HTML of the article. Not included by default. |
language | false | Specify a language for stopwords. If this is omitted, the API tries to detect the language. |
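One way to assemble a request URL from these parameters is sketched below. The function name build_extract_url is hypothetical, and sending include_html as the literal string "true" is an assumption, since the accepted values for that flag are not spelled out here. urllib.parse.urlencode takes care of the URL-encoding.

```python
from typing import Optional
from urllib.parse import urlencode

EXTRACT_ENDPOINT = "https://api.articlextractor.com/v1/extract"


def build_extract_url(
    api_token: str,
    article_url: str,
    include_html: bool = False,
    language: Optional[str] = None,
) -> str:
    """Return the full GET URL for the article extraction endpoint."""
    # Required parameters.
    params = {"api_token": api_token, "url": article_url}
    if include_html:
        # Assumption: the flag is sent as the literal string "true".
        params["include_html"] = "true"
    if language is not None:
        params["language"] = language
    # urlencode percent-encodes every value, including the article URL.
    return f"{EXTRACT_ENDPOINT}?{urlencode(params)}"


print(build_extract_url("YOUR_API_TOKEN", "https://example.com/news/story.html", language="en"))
```

Optional parameters are left out of the query string entirely when they are not set, so the API falls back to its documented defaults.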
Response Objects
name | description |
---|---|
data > publish_date | The article publish date. |
data > source_url | The URL of the article source. |
data > url | The article URL. |
data > canonical_link | The canonical link of the article. |
data > title | The article title. |
data > top_image | The main image of the article. |
data > images | An array of all images within the article. |
data > videos | An array of all videos within the article. |
data > text | The full extracted text of the article. |
data > tags | An array of tags associated with the article. |
data > authors | An array of authors. |
data > meta_image | Meta data image. |
data > meta_description | Meta data description. |
data > meta_keywords | Meta data keywords. |
data > meta_lang | Meta data language. |
data > meta_favicon | Meta data favicon. |
data > meta_site_name | Meta data site name. |
data > meta_data | An object containing all unstructured meta data within the article. |
data > html | Optional - HTML of the article. See include_html parameter for more information. |
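To show how these fields might be consumed, here is a minimal sketch that maps a few of them onto a small Python dataclass. The Article class and the article_from_response helper are hypothetical names, only a subset of fields is covered, and the sample payload is hand-written rather than real API output. Missing or null fields are tolerated on the assumption that not every article exposes every value.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Article:
    """A subset of the fields found under the "data" key."""
    title: Optional[str] = None
    url: Optional[str] = None
    text: Optional[str] = None
    authors: List[str] = field(default_factory=list)
    images: List[str] = field(default_factory=list)
    html: Optional[str] = None  # Only populated when include_html is requested.


def article_from_response(payload: dict) -> Article:
    """Build an Article from a decoded JSON response."""
    data = payload.get("data") or {}
    return Article(
        title=data.get("title"),
        url=data.get("url"),
        text=data.get("text"),
        # Treat null or missing arrays as empty lists.
        authors=list(data.get("authors") or []),
        images=list(data.get("images") or []),
        html=data.get("html"),
    )


# Hand-written sample payload for illustration only.
sample = {
    "data": {
        "title": "Sample title",
        "url": "https://example.com/a",
        "text": "Body text",
        "authors": ["Jane Doe"],
        "images": None,
        "html": None,
    }
}
article = article_from_response(sample)
print(article.title, article.authors, article.images)
```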
Supported stopword languages
code | full name |
---|---|
ar | Arabic |
ru | Russian |
nl | Dutch |
de | German |
en | English |
es | Spanish |
fr | French |
he | Hebrew |
it | Italian |
ko | Korean |
no | Norwegian |
fa | Persian |
pl | Polish |
pt | Portuguese |
sv | Swedish |
hu | Hungarian |
fi | Finnish |
da | Danish |
zh | Chinese |
id | Indonesian |
vi | Vietnamese |
sw | Swahili |
tr | Turkish |
el | Greek |
uk | Ukrainian |
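A client may want to check a language code against this list before sending a request. The sketch below does this locally; the helper name normalize_language is hypothetical, and lower-casing the input is a client-side convenience that assumes the API expects the codes exactly as listed above.

```python
from typing import Optional

# Codes copied from the supported stopword languages table.
SUPPORTED_LANGUAGES = frozenset({
    "ar", "ru", "nl", "de", "en", "es", "fr", "he", "it", "ko",
    "no", "fa", "pl", "pt", "sv", "hu", "fi", "da", "zh", "id",
    "vi", "sw", "tr", "el", "uk",
})


def normalize_language(code: Optional[str]) -> Optional[str]:
    """Return a validated language code, or None to let the API detect it."""
    if code is None:
        return None
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported stopword language: {code!r}")
    return normalized


print(normalize_language(" EN "))
```

Returning None for a missing code keeps the language parameter optional, so the API can fall back to language detection.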
Example Request
GET https://api.articlextractor.com/v1/extract?url=https%3A%2F%2Fedition.cnn.com%2F2022%2F09%2F12%2Fworld%2Fjames-webb-space-telescope-image-orion-nebula-scn%2Findex.html&language=en&api_token=YOUR_API_TOKEN
Example Response
{
"data":{
"publish_date": "2022-09-12 14:00:05+00:00",
"source_url":"https://edition.cnn.com",
"url":"https://edition.cnn.com/2022/09/12/world/james-webb-space-telescope-image-orion-nebula-scn/index.html",
"canonical_link":"https://www.cnn.com/2022/09/12/world/james-webb-space-telescope-image-orion-nebula-scn/index.html",
"title":"'Breathtaking' Webb images to reveal the secrets of star birth",
"top_image":"https://media.cnn.com/api/v1/images/stellar/prod/220912095026-01-james-webb-space-telescope-orion-nebula.jpg?q=w_800,c_fill",
"images":[
"https://media.cnn.com/api/v1/images/stellar/prod/220912095043-03-james-webb-space-telescope-orion-nebula.jpg?c=16x9&q=h_270,w_480,c_fill",
"https://media.cnn.com/api/v1/images/stellar/prod/220912095026-01-james-webb-space-telescope-orion-nebula.jpg?c=16x9&q=h_270,w_480,c_fill",
"https://media.cnn.com/api/v1/images/stellar/prod/150825105557-spacescience-logo-large-169.png?q=h_249,w_650,x_0,y_0/w_1280",
"https://media.cnn.com/api/v1/images/stellar/prod/150103074330-hubble-space-background-2-full-169.jpg?q=w_1600,h_900,x_0,y_0,c_crop/h_270,w_480",
"https://media.cnn.com/api/v1/images/stellar/prod/141217021344-katie-hunt.jpg?c=16x9&q=h_270,w_480,c_fill/c_thumb,g_face,w_100,h_100",
"https://media.cnn.com/api/v1/images/stellar/prod/220912095026-01-james-webb-space-telescope-orion-nebula.jpg?q=w_800,c_fill"
],
"videos":[],
"text":"CNN —\n“Breathtaking” images of a stellar nursery in the Orion Nebula taken by the James Webb Space Telescope are revealing intricate details about how stars and planetary systems form.\nThe images, released Monday, shed light on an environment similar to our own solar system when it formed more than 4.5 billion years ago. Observing the Orion Nebula will help space scientists better understand what happened during the first million years of the Milky Way’s planetary evolution, said Western University astrophysicist Els Peeters in a news release.\n“We are blown away by the breathtaking images of the Orion Nebula. We started this project in 2017, so we have been waiting more than five years to get these data,” Peeters said.\n“These new observations allow us to better understand how massive stars transform the gas and dust cloud in which they are born,” Peeters added.\nThe inner region of the Orion Nebula as seen by the James Webb Space Telescope's NIRCam instrument. NASA/ESA/CSA/PDRS4all\nThe hearts of stellar nurseries like the Orion Nebula are obscured by large amounts of stardust, making it impossible to study what is happening inside with instruments like the Hubble Space Telescope, which rely mainly on visible light.\nWebb, however, detects the infrared light of the cosmos, which allows observers to see through these layers of dust, revealing the action happening deeply inside the Orion Nebula, the release said. The images are the most detailed and sharpest taken of the nebula – which is situated in the Orion constellation 1,350 light-years away from Earth – and the latest offering from the Webb telescope, which began operating in July.\n“Observing the Orion Nebula was a challenge because it is very bright for Webb’s unprecedented sensitive instruments. But Webb is incredible, Webb can observe distant and faint galaxies, as well as Jupiter and Orion, which are some of the brightest sources in the infrared sky,” said research scientist Olivier Berné at CNRS, the French National Center for Scientific Research, in the news release.\nThe new images reveal numerous structures inside the nebula, including proplyds – a central protostar surrounded by a disk of dust and gas in which planets form.\n“We have never been able to see the intricate fine details of how interstellar matter is structured in these environments, and to figure out how planetary systems can form in the presence of this harsh radiation. These images reveal the heritage of the interstellar medium in planetary systems,” said Emilie Habart, an associate professor at Institut d’Astrophysique Spatiale (IAS) in France.\nAlso clearly visible at the heart of the Orion Nebula is the trapezium cluster of young massive stars that shape the cloud of dust and gas with their intense ultraviolet radiation, according to the news release. Understanding how this radiation impacts the cluster’s surroundings is key to understanding the formation of stellar systems.\n“Massive young stars emit large quantities of ultraviolet radiation directly into the native cloud that still surrounds them, and this changes the physical shape of the cloud as well as its chemical makeup. How precisely this works, and how it affects further star and planet formation is not yet well known,” Peeters said.\nThe images will be studied by an international collaboration of more than 100 scientists in 18 countries known as PDRs4All.",
"tags":[],
"authors":[
"Katie Hunt"
],
"meta_image":"https://media.cnn.com/api/v1/images/stellar/prod/220912095026-01-james-webb-space-telescope-orion-nebula.jpg?q=w_800,c_fill",
"meta_description":"\"Breathtaking\" images of a stellar nursery in the Orion Nebula taken by the James Webb Space Telescope are revealing intricate details about how stars and planetary systems form.",
"meta_keywords":[
"celestial bodies and objects",
"planets and moons",
"space and astronomy",
"science"
],
"meta_lang":"en",
"meta_favicon":"/media/sites/cnn/favicon.ico",
"meta_site_name":"CNN",
"meta_data":{
"viewport":"width=device-width,initial-scale=1,shrink-to-fit=no",
"og":{
"title":"New 'breathtaking' Webb images to reveal the secrets of star birth | CNN",
"description":"\"Breathtaking\" images of a stellar nursery in the Orion Nebula taken by the James Webb Space Telescope are revealing intricate details about how stars and planetary systems form.",
"image":"https://media.cnn.com/api/v1/images/stellar/prod/220912095026-01-james-webb-space-telescope-orion-nebula.jpg?q=w_800,c_fill",
"type":"article",
"url":"https://www.cnn.com/2022/09/12/world/james-webb-space-telescope-image-orion-nebula-scn/index.html",
"site_name":"CNN"
},
"twitter":{
"title":"New 'breathtaking' Webb images to reveal the secrets of star birth | CNN",
"description":"\"Breathtaking\" images of a stellar nursery in the Orion Nebula taken by the James Webb Space Telescope are revealing intricate details about how stars and planetary systems form.",
"image":"https://media.cnn.com/api/v1/images/stellar/prod/220912095026-01-james-webb-space-telescope-orion-nebula.jpg?q=w_800,c_fill",
"card":"summary_large_image",
"site":"@CNN"
},
"description":"\"Breathtaking\" images of a stellar nursery in the Orion Nebula taken by the James Webb Space Telescope are revealing intricate details about how stars and planetary systems form.",
"template_type":"article_leaf",
"type":"article",
"meta-section":"world",
"meta-branding":"space-and-science",
"theme":"world",
"article":{
"published_time":"2022-09-12T14:00:05Z",
"modified_time":"2022-09-12T14:26:58Z",
"tag":"celestial bodies and objects, planets and moons, space and astronomy, science",
"publisher":"https://www.facebook.com/CNN"
},
"keywords":"celestial bodies and objects, planets and moons, space and astronomy, science",
"author":"Katie Hunt",
"fb":{
"app_id":80401312489
}
},
"html":null
}
}
Errors
If your request is unsuccessful, you will receive a JSON-formatted error. The errors you may encounter when using the API are listed below.
error code | HTTP status | description |
---|---|---|
malformed_parameters | 400 | Validation of parameters failed. The failed parameters are usually shown in the error message. |
invalid_api_token | 401 | Invalid API token. |
usage_limit_reached | 402 | Usage limit of your plan has been reached. Usage limit and remaining requests can be found in the X-UsageLimit-Limit header. |
endpoint_access_restricted | 403 | Access to the endpoint is not available on your current subscription plan. |
resource_not_found | 404 | Resource could not be found. |
invalid_api_endpoint | 404 | API route does not exist. |
rate_limit_reached | 429 | Too many requests in the past 60 seconds. Rate limit and remaining requests can be found in the X-RateLimit-Limit header. |
server_error | 500 | A server error occurred. |
maintenance_mode | 503 | The service is currently under maintenance. |
Example Error Response
{
"error": {
"code": "malformed_parameters",
"message": "The published_before parameter(s) are incorrectly formatted."
}
}
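The error format above can be turned into an exception on the client side. The following is a minimal sketch, not an official client: ApiError and raise_for_error are hypothetical names, and the choice of which HTTP statuses count as retryable is an assumption made for illustration, not something this documentation specifies.

```python
import json

# Assumption: rate limiting, server errors and maintenance are transient
# and worth retrying. This policy is not defined by the API itself.
RETRYABLE_STATUSES = {429, 500, 503}


class ApiError(Exception):
    """Raised when the API returns a JSON-formatted error."""

    def __init__(self, status: int, code: str, message: str) -> None:
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message

    @property
    def retryable(self) -> bool:
        return self.status in RETRYABLE_STATUSES


def raise_for_error(status: int, body: str) -> dict:
    """Return the decoded payload, or raise ApiError for an error response."""
    payload = json.loads(body)
    if status < 400 and "error" not in payload:
        return payload
    # Fall back to placeholder values if the error object is incomplete.
    error = payload.get("error") or {}
    raise ApiError(status, error.get("code", "unknown"), error.get("message", ""))


try:
    raise_for_error(429, '{"error": {"code": "rate_limit_reached", "message": "Too many requests."}}')
except ApiError as exc:
    print(exc.code, exc.retryable)
```

Callers can branch on exc.code for specific handling (for example, prompting for a new token on invalid_api_token) and use exc.retryable to decide whether to try again later.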
Examples
Code Examples
See our prepared examples below to quickly get started implementing our API into your next project.
PHP
$queryString = http_build_query([
'api_token' => 'YOUR_API_TOKEN',
'url' => 'https://edition.cnn.com/2022/09/12/world/james-webb-space-telescope-image-orion-nebula-scn/index.html',
]);
$ch = curl_init(sprintf('%s?%s', 'https://api.articlextractor.com/v1/extract', $queryString));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$json = curl_exec($ch);
curl_close($ch);
$apiResult = json_decode($json, true);
print_r($apiResult);
Python
# Python 3
import http.client, urllib.parse
conn = http.client.HTTPSConnection('api.articlextractor.com')
params = urllib.parse.urlencode({
'api_token': 'YOUR_API_TOKEN',
'url': 'https://edition.cnn.com/2022/09/12/world/james-webb-space-telescope-image-orion-nebula-scn/index.html',
})
conn.request('GET', '/v1/extract?{}'.format(params))
res = conn.getresponse()
data = res.read()
print(data.decode('utf-8'))
Go
package main

import (
    "fmt"
    "io"
    "net/http"
    "net/url"
)

func main() {
    // Parse the extraction endpoint, then attach URL-encoded query parameters.
    baseURL, err := url.Parse("https://api.articlextractor.com/v1/extract")
    if err != nil {
        panic(err)
    }
    params := url.Values{}
    params.Add("api_token", "YOUR_API_TOKEN")
    params.Add("url", "https://edition.cnn.com/2022/09/12/world/james-webb-space-telescope-image-orion-nebula-scn/index.html")
    baseURL.RawQuery = params.Encode()

    res, err := http.Get(baseURL.String())
    if err != nil {
        panic(err)
    }
    defer res.Body.Close()

    // Read and print the raw JSON response body.
    body, err := io.ReadAll(res.Body)
    if err != nil {
        panic(err)
    }
    fmt.Println(string(body))
}
JavaScript
var requestOptions = {
method: 'GET'
};
var params = {
api_token: 'YOUR_API_TOKEN',
url: 'https://edition.cnn.com/2022/09/12/world/james-webb-space-telescope-image-orion-nebula-scn/index.html'
};
var esc = encodeURIComponent;
var query = Object.keys(params)
.map(function(k) {return esc(k) + '=' + esc(params[k]);})
.join('&');
fetch("https://api.articlextractor.com/v1/extract?" + query, requestOptions)
.then(response => response.text())
.then(result => console.log(result))
.catch(error => console.log('error', error));
C#
var client = new RestClient("https://api.articlextractor.com/v1/extract");
client.Timeout = -1;
var request = new RestRequest(Method.GET);
request.AddQueryParameter("api_token", "YOUR_API_TOKEN");
request.AddQueryParameter("url", "https://edition.cnn.com/2022/09/12/world/james-webb-space-telescope-image-orion-nebula-scn/index.html");
IRestResponse response = client.Execute(request);
Console.WriteLine(response.Content);
Java
OkHttpClient client = new OkHttpClient().newBuilder()
.build();
HttpUrl.Builder httpBuilder = HttpUrl.parse("https://api.articlextractor.com/v1/extract").newBuilder();
httpBuilder.addQueryParameter("api_token", "YOUR_API_TOKEN");
httpBuilder.addQueryParameter("url", "https://edition.cnn.com/2022/09/12/world/james-webb-space-telescope-image-orion-nebula-scn/index.html");
Request request = new Request.Builder().url(httpBuilder.build()).build();
Response response = client.newCall(request).execute();
System.out.println(response.body().string());