How does OSINT make the difference?

In 2017, a cyclist in Leiden got into an argument with a car driver. In the heated debate, the cyclist turned violent and threw the pregnant car driver into the river De Vliet. Clever use of OSINT quickly revealed the name and address of the suspect. How did they find the suspect? How can the clever use of OSINT help get the right intelligence to find answers?

Open Source Intelligence (OSINT) is an integrated, collaborative methodology to retrieve a balanced, representative and validated set of the best possible information from open sources and – after careful scrutiny and analysis – produce actionable intelligence. Intelligence that can be used to support decision makers in managing change. OSINT is used in the preparation of peacekeeping missions, in international conflict and security studies, in market intelligence to explore new market opportunities, by law enforcement to find suspects, by banks to vet new customers, and more. The private sector uses OSINT for almost anything that requires good information, and journalists, historians and researchers use it alike. Below we describe what OSINT is, how it works, and give a few practical examples of applying OSINT.

OSINT is widely misunderstood as a bunch of technical tricks in support of cyber operations. It is not. In-depth knowledge of open sources and in-depth knowledge of search systems and search languages are the minimum requirements for OSINT operations. Running the RIS OSINT Intelligence Cycle will produce powerful intelligence reports that are true force multipliers. The OSINT system described here has been developed since 1990, when the author started to establish the OSINT branch of the Dutch Defence Intelligence \& Security Service. The system has since become known as Arno’s OSINT Methodology.

A practical OSINT example

Clever use of open source intelligence tools and techniques can be a true force multiplier for any intelligence operation. Knowing how to find relevant and timely information in open sources and how to analyse that to get actionable intelligence can lead to quick decisions at low cost. But only when done by professionals.

In the above example, the cyclist and the car driver were involved in a minor accident on a narrow road next to the beautiful river De Vliet, a road often used by cyclists for relaxation and sport alike. Both got involved in a heated argument. At a certain point the cyclist turned violent and threw the pregnant car driver into the river De Vliet, leaving her in despair. Clearly a case for the police to solve.

How did they do that using OSINT?

Police were quick to act. The OSINT department first did some target profiling. What do cyclists do? How do they behave? What is a general characteristic of a cyclist? That one is not too difficult: they like to keep a record of their progress and their achievements. How do they do that? By using a bicycle computer that keeps track of essential data like distance, speed, averages, energy consumption, heart rate, blood pressure, and the like. Some of the more advanced ones also keep track of training progress and of the exact route the cyclist has ridden, with dates and times, provided the cyclist has an online account, that is.

One plus one sometimes leads to more than two, and that is certainly the case in \gls{osint}. The next question was: which commercially available bicycle computers keep track of training progress data online? It turned out that there were only four of those.

How many of these keep their customers’ progress data online? All four did. So, it may be a bit of work, but nevertheless the next question was: of those four online bicycle computer databases, how many participants were cycling at the date and time of the incident near the river De Vliet? Guess what: only one. Name included. Bingo! Now go to social media to check the name and find an address. Got you!

(Based on a real case. Some details have been changed to protect privacy.)

What is OSINT?

Everybody needs information. Everybody needs good information.

Information that is timely, reliable, actionable and validated. Information that can be used to take decisions. Decisions on whether or not to deploy troops on a mission, whether or not to enter a new business market, or whether to start a new research project. Law enforcement, financial institutions, international missions, strategic analysts, journalists, (inter)national governments, intelligence services (competitive, general or defence), students, researchers, and many more, all need a proper foundation or a good information profile to take decisions.

And all of them discover the power of OSINT.

Properly applied, \gls{osint} can be a true force multiplier, once certain conditions are met: a very good understanding of the organisation of the global information landscape, a very good knowledge of open sources (where to find what), very good skills in searching databases and query logic, and obviously a very good understanding of the initial requirements.

In short, OSINT is creating as perfect a match as possible between the supply side and the demand side of information.

OSINT as described here is a methodology I developed while acting head of the Open Source Intelligence Branch of the Dutch Defence Intelligence and Security Service. Back in 1990, when I was invited by the DISS to establish their OSINT branch, it became clear that there was no such thing as OSINT in Europe yet. Everything had to be developed from the ground up. Gradually, an OSINT methodology was developed: a system for getting timely, validated and reliable information to the customer, at the right time and in the right format. In this case, strategic analysts.

The official definition is:

OSINT is a collaborative, integrated methodology and production process where customers' intelligence requirements are met by providing them with actionable intelligence that is produced through a process of synthesis and analysis based on a representative selection of open source information that is validated, reliable, timely, and accurate.

The object of OSINT, the information being collected and analysed, is called open source information (OSINF). The official definition is:

OSINF, or simply open sources, is all information in the public domain, in any format, that can be acquired by anyone without any restrictions, whether for free or commercial, in a legal and ethically acceptable way.

There are some restrictions:

  • Legal basically means no hacking, no ‘computer network exploitation’, no password cracking etc. All must be done in a legal way.
  • Ethical means that some sources happen to be in the open domain that are clearly not intended to be there. Considering the amount of open sources available and the in-depth professional OSINT research techniques being used, the OSINTian may well find information that was never supposed to be found. That information is not considered OSINF and thus not part of OSINT. Examples are USB sticks found in the back of a taxi cab, information released by accident, or Wikileaks: that information was stolen, edited and manipulated before being published on the Net. It is therefore not part of OSINT. No misunderstandings here: the information is interesting and will probably be used in an intelligence context, but it is not pure OSINF.

 

How much information?

How much open source information is there out there anyway? Much. Very much. Very very much as a matter of fact.

OSINF

OSINF is not just the Internet(1) like so many seem to think. It is also printed sources, radio and TV. It is also human sources, discussion groups, national archives, registries, etc. It is also handbooks, libraries, NNTP, FTP and the deep web. Almost all ‘information’ out there is OSINF. Do not make the mistake of thinking that OSINT is limited to just the world wide web; OSINF is everything. The Internet is only a small part of the OSINF domain.

Almost all information comes from open sources one way or another. What often happens, however, is that intelligence workers take information from the open domain, classify that information as secret, and then claim that the information came from a secret source. There is no such thing as secret information. HUMINT, SIGINT, MASINT and IMINT all make exclusive use of information that can legally and ethically be obtained by anyone. Hence, open source information.

One of the largest libraries in the world, the Library of Congress, holds more than 170,000,000 items. They collect (almost) everything, from books, maps, CDs, records, journals and newspapers to just about anything. The British Library, the Vatican Library, the Library of St. Petersburg: all the same thing. Massive amounts of information, all systematically organised and validated.

Commercial information providers such as LexisNexis, ProQuest Dialog and Factiva hold thousands and thousands of databases full text online. LexisNexis, for example, currently offers more than 38,000 databases with an estimated total number of documents exceeding 8,500,000,000. Scopus (Elsevier) holds a database of scientific peer-reviewed literature: 71,000,000 records and 1,500,000,000 references in all. The Web of Science (Web of Knowledge) holds 171,000,000 records with 2,000,000,000 references. All information is validated: no dead links, no duplicates, no disinformation or fake information.

Never ever ask an OSINTian or librarian to ”find everything about…”. I am serious here. I once did. With someone I did not like.

Internet

The Internet is only a small part of what is available in the global information landscape.

Thinking that all information is available in electronic format and indexed by an Internet search engine is an assumption. And a wrong one too. The Internet has a number of services, or ‘sources’ as I like to see them, that hold valuable information, such as FTP, NNTP, IRC, POP and SMTP.

Internet Relay Chat (IRC) is still widely used, for instance by child traffickers, but also by computer programmers and many others. It used to be very much larger than it is today, but still: serious investigators simply cannot afford to miss IRC groups.

NNTP holds tens of thousands of discussion groups, still in use today, about any conceivable subject. The binary part is mostly the illegal part, where stolen books, software, music etc. are offered and asked for. Some famous football club fan groups hold their discussions on NNTP, as I recently discovered when doing research on hooliganism.

FTP (File Transfer Protocol) is used to upload and download files to or from public Internet servers. Like university computers. Some of them are so large that it is pretty difficult to find out what exactly is going on.

Telnet is used to run software on someone else’s computer. It can be used to go online to use a library catalogue, an online database, or for investigators to check the validity of e-mail addresses without sending an e-mail.
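A minimal sketch of that e-mail check, written in Python rather than as a raw telnet session, might look as follows. It speaks SMTP to a mail server, asks whether the server would accept the recipient, and quits before anything is sent. The mail server and addresses below are hypothetical (in practice one would first look up the MX record of the target domain), and many servers nowadays accept or reject everything regardless, so the reply is an indication, not proof:

import smtplib

def probe_address(mail_server: str, address: str) -> bool:
    """Ask an SMTP server whether it would accept `address`, without sending mail.

    The dialogue mirrors a manual telnet session on port 25:
    HELO, MAIL FROM, RCPT TO, QUIT.
    """
    with smtplib.SMTP(mail_server, 25, timeout=10) as smtp:
        smtp.helo("example.org")                  # hypothetical identifying host
        smtp.mail("osint.probe@example.org")      # hypothetical sender address
        code, _reply = smtp.rcpt(address)         # the server's verdict on the recipient
    return code == 250                            # 250 means the recipient was accepted

# Hypothetical mail server and target address, for illustration only.
print(probe_address("mail.example.com", "someone@example.com"))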

POP is not just about reading e-mail. Thousands of ListServ discussion groups make use of POP and SMTP to manage discussions via e-mail. I am a big fan of some very useful listserv discussion groups for librarians, discussing sources and problems. All the above services may hold valuable information for the OSINT researcher.

HTTP (the world wide web) is but one service on the Internet. It is large, there is a lot, but it is only one.

Google

Luckily, we have Google, which makes the entire world wide web retrievable. Or does it?

We have already seen that the world of open source information is very much more than just the Internet. First of all, there is the pre-Internet world of printed information and information available online via commercial information providers. The ‘Internet’ is only part of that. And the Internet consists of many different services, as we have previously seen; the WWW is only one of them.

The WWW consists of a wide array of web-based services, a large part of which is in the so-called deep web. The deep web is the part of the WWW that is not indexed by search engines. In other words, the part of the WWW that you will not be able to find (unless you know the exact URI).

The question then is, how large is the deep web with respect to the total WWW? According to Sherman and Price(2), the deep web is estimated to be 50-500 times larger than the indexed (surface) web. This implies that only about 2\%, or perhaps even just 0.2\%, of the WWW can be found using a search engine.
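A quick check of those percentages, taking the indexed part as one unit and the deep web as 50 to 500 times that size: the indexed share of the total WWW is then \texttt{1 / (1 + 50)}, roughly 2\%, or \texttt{1 / (1 + 500)}, roughly 0.2\%.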

So how many Internet search engines are there, and how large is the overlap? The answer to the first question is easy: fewer than ten. Surprisingly, the overlap between the major search engines is only around 20\%; an exact explanation for this is unknown.

And Google? Google is just one of these search engines.

Do you now realise how limited the Google search engine is with respect to the overall global information landscape? There is an overwhelming amount of open sources available. One tiny part is the Internet, which consists of many services, amongst which the world wide web. About 98\% of the world wide web is in the deep web; only 2\% or so is indexed by search engines. The overlap between these search engines is around 20\%, and Google is just one of them. Even worse, large parts of the information contained in free Internet search engines are unvalidated and not checked for reliability.

Google is large, very large, but in OSINF, as an information source, it plays a minor role. Thinking that the answer you are looking for is available somewhere on the Internet is an assumption. Thinking that within that Internet domain Google has the answer is the second assumption. Both are wrong.

And now that we have mentioned the phrase ”search engine”, I am quite often asked what the best Internet search engine out there is. The answer is self-explanatory: your local librarian. Librarians have access to unimaginable printed and online sources, and they know their way through these sources to find the exact answers you are looking for. Never underestimate a skilled librarian.

1) Technically, the Internet is a network of computers all running the TCP/IP protocol. In this paper we use the phrase Internet mainly for information sources found on the Internet.

2) The Deep Web / Gary Price, Chris Sherman.

 

How to do OSINT

The OSINT Intelligence Cycle explains how an OSINT research process is run. In short, there are three stages in an OSINT process. As noted earlier, the system has been developed since 1990, when the author started to create the OSINT Branch for the Dutch Defence Intelligence \& Security Service, and it has since become known as Arno’s OSINT Methodology.

The cycle consists of three parts or wings, with the customer in the middle. In many existing intelligence cycles, the customer is not even mentioned and plays no role in the process. In our opinion, any intelligence production process is pointless if there is no match with a customer. The customer is the pivotal point of all OSINT operations.

The first wing is the exact information requirement of the customer. Its three parts make sure that the OSINT researcher knows exactly what the precise requirements of the customer are. Since the requirements can change quickly over time, the loop back makes sure that the outcome of the first wing is matched with the current requirements. If there is a match, we can continue to the second wing. If not, we do the first wing again.

The second wing is the search and collection phase. It is about creating a workable collection plan, validating the retrieved information against certain quality and reliability criteria, monitoring developments, and a fusion phase where all validated information from a large variety of sources is brought together and deduplicated. Finally, the information is indexed with metadata to make it internally retrievable. Then we go back to the customer again to match the results with the requirements. If the customer is happy, we can stop at this point. If not, we do the second wing again. If the customer would like an analysis of the results, we continue with the third wing.

The third wing is the analysis and distribution wing. Here, the open source information results are matched with information from other, closed or covert, sources and analysed. We call this step synthesis, since analysis is done at every step. Presenting and distributing intelligence are obvious challenges of their own. A fantastic OSINT report with a great analysis can be completely destroyed by a poor presentation (either orally or in writing). And distribution can be a challenge too, since in the intelligence community there is, to this day, still a strong tendency to write fantastic intelligence reports but then not publish them, because the content is secret. The last step is knowledge management: recording the lessons learned, the methodologies used and the know-how, in such a way that the knowledge obtained can be used again the next time.

 

Cases of OSINT

Let’s take a look at two real examples of what practical OSINT looks like.

The first is about trying to find out whether Iran is actively working on the development of chemical weapons. It is a wonderful example of combining open sources, free as well as fee-based OSINF, with advanced search techniques. This example demonstrates that OSINT is about knowing sources, knowing online databases and mastering advanced search techniques. It also shows the role of Internet search engines such as Google: Google is not used to find the answers to questions; Google and the like are used to find authoritative sources.

The second one is a little more technical. It shows how to check a webpage for validity, finding out that the person does not exist and that the information on that page is fake. But let’s start with Iran first.

Iran and chemical weapons

The assignment is clear: we would like to find out whether Iran is importing raw materials for the production of chemical weapons. How to proceed using the OSINT methodology?

The first step is to find out what exactly a chemical weapon is. Sources to be used here include the OPCW (Organisation for the Prohibition of Chemical Weapons), to look up a general description of chemical weapons. We can also use the FAS (Federation of American Scientists) to do that, or any other handbook or organisation such as the World Health Organisation, the US National Library of Medicine, the Centers for Disease Control and Prevention, or even a good old newspaper like the New York Times. What we learn from this is that ordinary pesticides can be used as chemical weapons.

The second step is: how many pesticides are there? What are their names? Are there any synonyms for pesticides? We can use a variety of OSINF here. Either use handbooks, encyclopedias or some other authoritative source, or make use of fee-based OSINF, like ProQuest Dialog for instance. We can use one (or more) of their databases for chemical substances, such as Derwent Chemistry Resource, Beilstein Facts or ChemSearch, to name a few. Also consider Pesticide Factfile. The query looks somewhat like the following:

? B 390, 398, 355, 306
? S PESTICIDES/na,de
? MAP SY T S1
? SAVE TEMP

All these databases allow searching on nicknames, synonyms etc., returning a list of alternative names for all kinds of pesticides, insecticides and the like. On line 1 we open the appropriate databases; line 2 is the query searching for the phrase pesticides in the name field and in the descriptor field; on line 3 we map all the synonyms from set 1 into a temporary set; and on the final line we save this set temporarily on the Dialog server.

Now that we have a list of common nicknames, synonyms etc. for pesticides, we proceed with database number 571, PIERS Exports. Let’s do a search in this database to find out how many pesticides were exported from US harbours to Iran, by launching our earlier saved query, sorting the records and creating an online report from all this. The queries look like:

? B 571
? EXS
? SORT S1/ALL/CN,LB,CO
? REPORT S2/ALL/CN,LB,CO

This will produce a report that shows how many pesticides were exported to which country, with export date and manufacturer’s name.

Country of
Non-U.S.    Weight                                    Date
Port        (Pounds)   U.S.-based Company             Shipped
----------  ---------  -----------------------------  -------

ARGENT      5,388      BAYER                          031214
ARGENT      7,771      NA                             041226
ARGENT      8,049      E I DUPONT DE NEMOURS          040527
ARGENT      8,049      E I DUPONT DE NEMOURS          040527
ARGENT      9,720      BAYER                          040718
ARGENT      11,661     NA                             050407
ARGENT      15,878     NA                             041126
The list gives an estimate of goods transported to other countries. Doing some analysis of import and export from and to certain countries will eventually lead to the goal.

Now that this first rough estimate of the amount of pesticides exported to Iran is known, let’s move on to step four. How much pesticide is, within reason, needed on average per hectare of agricultural land? Again, let’s find an authoritative source for this one, such as the European Environment Agency, which gives a total pesticide consumption per hectare of agricultural land.

With the rough estimate of the required amount of pesticides now known, the next question is: how much agricultural land does Iran actually have? In comes the good old CIA World Factbook, from which we learn that Iran has about 9.78\% arable land.

Finally, what follows is a little computation. We have 1) the amount of pesticides imported into Iran, 2) the amount of pesticide needed per hectare of agricultural land, and 3) the amount of agricultural land in Iran. What follows is a very simple calculation: \texttt{1 - (2 * 3)}, to find that Iran is importing far too much pesticide to reasonably be used for agricultural purposes.
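As a minimal sketch of that last step in Python, with purely hypothetical figures standing in for the numbers found in the sources above:

# Hypothetical illustration of the final calculation; these are not real trade figures.
imported_pesticides_kg = 5_000_000    # (1) total pesticide imports into the country
need_per_hectare_kg = 2.5             # (2) average agricultural need per hectare
agricultural_land_ha = 1_600_000      # (3) hectares of agricultural land

# Surplus = imports minus what agriculture could reasonably consume: 1 - (2 * 3)
reasonable_use_kg = need_per_hectare_kg * agricultural_land_ha
surplus_kg = imported_pesticides_kg - reasonable_use_kg

print(f"Reasonable agricultural use: {reasonable_use_kg:,.0f} kg")
print(f"Unexplained surplus:         {surplus_kg:,.0f} kg")

A large positive surplus is what triggers the follow-up question: what is all that extra pesticide for?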

The uninitiated may start to think that this looks like a poor analysis. It is not. It has nothing to do with analysis. The example only shows part of a solution to find authoritative information. That’s it. It is a lovely example of combining free open sources with fee-based open sources to get answers to questions. Which is what OSINT is all about.

(This example was first presented by the author at a NATO OSINT meeting in Washington D.C. in 1995. The example has been edited slightly since then.)

 

Mr Bervoets

How to make money fast via the Internet?

Simple: just browse the website of Mr Frederik Bervoets, who explains how he, a lorry driver with no more than a secondary school education, discovered a method to get really rich very quickly. He created a manual on how to do that and made it public, available for everybody. It is really easy to get rich, he explains: there are no startup costs and he does not sell any products. There is also a lovely picture of the man having a drink on a beach.

The question is, is this true? Is the website real or could this website be a scam? Using OSINT techniques it quickly becomes clear what is going on.

The first thing we do is a reverse image search on the picture. Reverse image searching is a technique supported by most Internet search engines: searching on a graphic to find websites with the same or a similar graphic. The technique is useful for validating the reliability of images, finding the origin of images and detecting possible manipulation of images.
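A minimal sketch of kicking off such a reverse image search from a script, assuming the picture is reachable at a URL (the image URL below is hypothetical, and Google may redirect this legacy endpoint to its newer Lens interface; Bing, Yandex and TinEye offer comparable options):

import urllib.parse
import webbrowser

# Hypothetical URL of the picture taken from the suspect website.
image_url = "https://example.com/beach-photo.jpg"

# Build the legacy Google reverse-image-search URL and open it in the default browser.
query = urllib.parse.urlencode({"image_url": image_url})
webbrowser.open("https://www.google.com/searchbyimage?" + query)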

The result is most interesting. Google returns a bunch of websites with exactly the same picture and the same text, but in a different language and with a different name for the author. The man is called Scott Evans, or Edgar Morgan, or Thomas Stodola. He is also called Andy somewhere. The text on all these websites is almost identical, except that the age of the author differs.

Another clue that may indicate that this website is a scam comes from checking the domain name. The domain name is frederikbervoets.com. The top-level domain is .com, which (back then) drew attention, since the website is in Dutch and aimed at the Netherlands. Why is the top-level domain not simply .nl? Let’s do a WHOIS lookup on the domain name to find the owner or registrant of this website and get a clue. The domain record shows that it is registered via Domains by Proxy, LLC, a subsidiary of GoDaddy.com, one of the largest providers in the world. Domains by Proxy is a company that registers domain names without revealing the identity of the owner. That is interesting. Why would Frederik Bervoets do that? What is there to hide?
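For the WHOIS step, a minimal sketch that speaks the plain WHOIS protocol (TCP port 43) directly to the .com registry server; in practice the whois command-line tool or any web-based lookup does the same and usually follows the referral to the registrar automatically:

import socket

def whois_query(domain: str, server: str = "whois.verisign-grs.com") -> str:
    """Send a WHOIS query over TCP port 43 and return the raw reply."""
    with socket.create_connection((server, 43), timeout=10) as sock:
        sock.sendall((domain + "\r\n").encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")

# The .com registry reply names the registrar and the registrar's own WHOIS server;
# querying that server in turn (same protocol) reveals the registrant, which in
# this case turns out to be the privacy service Domains by Proxy, LLC.
print(whois_query("frederikbervoets.com"))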

Let’s use another tool to find out more. Traceroute will show all the nodes between our computer and the target website (unless firewalls get in the way). Just take the final IP address and do a lookup on that address. In this case, we find an address on Smallmead Road in Reading, Berkshire, United Kingdom, with telephone number, e-mail address and more.

Again, a simple example of using OSINT techniques to find out more about something. This example shows that OSINTians do not just need a plan of action; they also need some technical skills: knowledge of the organisation of the Internet, domain names, IP addresses and network topology. These are not cyber skills; those go very much further.

 

 

 

Conclusion

OSINTians and librarians, when properly trained in the profession, or the Art of Information, and with sufficient research experience, can play a vital role in almost any organisation. Be it government, semi-government or the private sector, the skills to find the golden nuggets out there and to report the findings properly are vital to today’s decision makers. Developments in the field of OSINT (the subject of another little paper), such as artificial intelligence, globalisation, languages etc., will give the OSINTian the proper tools and means to excel and make the difference. Anytime. Anywhere.

Who truly was the most dishonest president?

Former President Donald Trump was often accused of having a complete disregard for the truth. Yet some of his predecessors’ falsehoods ranged from the bizarre to the horrifying. So how does Trump truly compare?

When Saddam Hussein invaded the oil-rich emirate of Kuwait in August 1990, President George HW Bush snarled: “This will not stand.”

But as US troops were scrambled to the Gulf, the American public was dubious about the justification for military action.

The Kuwaiti government-in-exile promptly hired a US public relations firm, Hill & Knowlton, whose Washington DC office was run by Bush’s former chief of staff.

[more]

https://www.bbc.com/news/world-us-canada-56246507

 

Five search tricks to get better results on Google

The Star, Sunday, 27 Dec 2020

When it comes to finding exactly what you’re looking for on the multiverse of websites out there, it’s all about little things called operators.

If you ever have to go beyond page one of Google’s search results, then it’s time to crack open the special toolbox for better online searches. These tweaks to your search will help the search engine better understand what you’re looking for.

1: Universal Google searches, not country-specific ones

Before you start using operators, you need to make sure you’re getting results from the right region. That’s because whenever you carry out a search with Google.com, you’re automatically taken to the search results for the country you’re searching from.

But sometimes you may want to carry out a wider search to get results that aren’t narrowed down by your location.

Of course, there’s a way to do this. The trick is to enter google.com/ncr in the address line. The “ncr” after the domain stands for “no country redirect.”

However, you’ll only get neutral search results if you’re not logged into your Google account. If you don’t want to log out, you can open another browser or else open a new tab in private mode in browsers like Firefox, Chrome or Opera.

2: Minus – say what you don’t want

The minus symbol followed immediately by a certain word will help you exclude search results you don’t want. If you type in “Spaghetti Carbonara -cream” you’ll get links to Carbonara recipes that don’t have cream in them.

Combining a variety of these operators will also allow you to do things like search a website for any mention of precisely your name, and not another name with a similar spelling.

There are many of these handy search parameters and they can be found on the support pages of search engines such as Google and Bing.

3: Quote marks – just search this exact phrase

Quote marks around the search words are our way of telling a search engine to search only for exactly this phrase; an asterisk inside the quotes can stand in as a placeholder for unknown words.

A search, for example, for “Portland, Oregon” will help you find results about only that city and no other city called Portland. Otherwise, search engines will usually interpret the spaces between the search terms as “and”.

4: Filetype – Only search in these kinds of files

If you want the search to be for a certain file format, you can work with the filetype command.

For example, if you add filetype:pdf after the search term, the search engine will only display PDF documents in which the search terms occur. In the search window this can look like: travel checklist filetype:pdf.

5: Site – Search just this website

Your searches don’t always have to be across the whole Internet. If you only want to search a specific website, simply precede the search term with site:[website domain] without the www.

Something like “site:gov.uk Malaysia” will return all results on British government websites where Malaysia is mentioned. – dpa

Google’s top trending US search terms of 2020: ‘election results’ and ‘coronavirus’

Richard Nieva

cnet.com

It’s been an unprecedented year. The world faced a deadly pandemic and the US held its most contentious election in recent history. Those things dominated our focus as we searched the internet in 2020. 

Google on Wednesday released its list of top trending search terms for this year. On top was the phrase “election results.” Coming in at No. 2 was coronavirus. Rounding out the top 3 was Kobe Bryant, the NBA legend who died in a helicopter crash in January. In Google parlance, “top trending” means the terms had the highest spike in traffic over a certain period of time this year compared with last year.

Google is the world’s largest search engine and the most visited site on the internet, so its popular search queries give us a good look into what people were thinking about over the past year. Last year’s top search was Disney+, the streaming service that launched last November. Another top search was Nipsey Hussle, the LA rapper known for his community service who was killed last year. 

It’s no surprise that election results topped Google searches. The company partnered with the Associated Press to display tabulations in real time. The intrigue dragged on long past election night, as counting continued in the following days because of a surge of mail-in ballots. 

But while people flocked to Google search for election results, the company was criticized for letting election misinformation run rampant on YouTube, which Google owns, in the days after the contest. More than a month later, President Donald Trump still hasn’t conceded to President-elect Joe Biden. (Biden was the top trending person and politician searched on Google this year.)

The coronavirus also dominated web searches. Aside from being the top entry, “coronavirus update” and “coronavirus symptoms” were the No. 4 and 5 searches of the year. Earlier on in the pandemic, Google launched a coronavirus hub for its search engine, highlighting statistics, as well as information about testing. 

Below are the full lists:

Searches

  1. Election results

  2. Coronavirus

  3. Kobe Bryant

  4. Coronavirus update

  5. Coronavirus symptoms

  6. Zoom

  7. Who is winning the election

  8. Naya Rivera

  9. Chadwick Boseman

  10. PlayStation 5

News

  1. Election results

  2. Coronavirus

  3. Stimulus checks

  4. Unemployment

  5. Iran

  6. Hurricane Laura

  7. Super Tuesday

  8. Stock market

  9. Murder hornet

  10. Australia fires

People

  1. Joe Biden

  2. Kim Jong Un

  3. Kamala Harris

  4. Jacob Blake

  5. Ryan Newman

  6. Tom Hanks

  7. Shakira

  8. Tom Brady

  9. Kanye West

  10. Vanessa Bryant

Politicians 

  1. Joe Biden

  2. Kamala Harris

  3. Boris Johnson

  4. Pete Buttigieg

  5. Mike Bloomberg

  6. Andrew Cuomo

  7. Chris Christie

  8. Mike Pence

  9. Andrew Yang

  10. Mitt Romney

You.com Search Engine Announced to Take on Google Search, Founder Says Will Not Rely on Ads for Results

The search engine is reportedly built using advanced natural language processing for refined relevant search results without having to rely on advertising.

 

You.com, a new search engine, has been announced to take on Google Search. This new search engine has been made by former Salesforce chief scientist Richard Socher. In a world where consumer search is plagued with clickbait content for monetary gains through advertising, You.com looks to be a trusted search platform with privacy controls, legit reviews, and AI-driven comprehensive results. The search engine is reportedly built using advanced natural language processing for refined relevant search results without having to rely on advertising.

The You.com website is live, but is currently taking registrations for early access. The site says You.com offers privacy controls to let users customise their browsing experience. The company says it ‘never sells your data to advertisers or follows you around the rest of the Internet.’ You.com also takes into account your values, helps you give back and support the right causes with the right tools, and makes it easier for you to search and shop according to your values. The site claims to offer trustworthy reviews from real users and experts, letting you know both the pros and cons of a product. You.com also claims to offer faster results, with priority given to real results over paid content and ads.

Socher spoke to TechCrunch about You.com. He said, “We are building You.com. You can already go to it today. And it’s a trusted search engine. We want to work on having more click trust and less clickbait on the internet.” Socher goes on to add that You.com was conceived over the need to offer relevant and accurate search results a priority from the vortex of information that is available online. The need for user data privacy is also increasing and has been of significant importance in 2020 as more of the world moved towards the Internet.

The former Salesforce employee asserts, “The biggest impact thing we can do in our lives right now is to build a trusted search engine with AI and natural language processing superpowers to help everyone with the various complex decisions of their lives, starting with complex product purchases, but also being general from the get-go as well.” The principal differentiator from Google Search will be that You.com will not rely on advertising or on what it knows about the user to serve results.

 

OSINT training programmes and workshops

Following the success of our Open Source Intelligence Pathfinder range of training programmes, we have now planned a full OSINT Pathfinder training programme almost every month in 2021.

The schedule looks as follows:

OSINT Pathfinder XXX: 12-14 January 2021. Venue: Novotel The Hague city center
OSINT Pathfinder XXXI: 16-18 February 2021. Venue: Novotel The Hague city center

 

 

6 Google search alternatives that respect your privacy

Kim Komando  |  Special to USA TODAY        November 2020

Between Google Search, Gmail, Google Maps, and all the rest, the tech giant knows a ton about you. Let’s not forget about YouTube, the second-largest search site behind Google.

I recently showed you how you could take control of what appears when you search for yourself. Once you find what’s publicly available about you, take steps to delete anything that doesn’t sit well with you, from images of your home to personal photos. Here’s my guide to doing an exhaustive search:

All this tracking and information gathering might have you looking for solid alternatives to Google. If you’re ready to make a change, try a few out and see what you like.

1. StartPage

StartPage calls itself “the world’s most private search engine.” The Netherlands-based company recognizes that when it comes to search, it’s hard to beat Google. That’s why they use the power of Google without passing along user tracking.

StartPage pays Google for the use of its search algorithm but strips out the tracking and advertising that usually comes along with it. You get a Google-like experience, along with the promise that your data will never be stored, tracked, or sold.

Test it out at startpage.com. You can also set StartPage as your browser’s default search engine.

2. Ecosia

Ecosia takes an entirely different approach. It’s a traditional search engine, ads and all, but the money raised is used to make the world a greener place. When you search on Ecosia, you’re helping to plant trees all around the world.

A nice bonus if you’re privacy-conscious: Ecosia doesn’t sell your data, searches are encrypted, and search data is anonymized within a week. They do collect “a small amount of data” by default, but you can opt-out.

Search on ecosia.org or you can add an extension to your computer or mobile browser.

3. Dogpile

While Google uses an algorithm to sort through billions of webpages, Dogpile instead fetches results from the major search engines. Google, Yahoo, Bing, and the rest have their ways of sorting through results, and Dogpile analyzes them all to help you find what you’re seeking.

Try it out at dogpile.com. Type in what you want to search and hit “Go Fetch!”

4. DuckDuckGo

This search site is likely the most well-known privacy-focused one of the bunch. DuckDuckGo doesn’t track users, so it’s not clear exactly how many people use it. However, the CEO estimates about 25 million users.

Why does it stand out? DuckDuckGo doesn’t track you the way Google does, it doesn’t allow targeted advertising, search results are not based on your search history, and you’ll see fewer ads based on your search.

It’s easy to use and install, too, with an extension that plugs in with all the major browsers. You can also search at duckduckgo.com.

5. Kiddle

If you have little ones at home, consider Kiddle. It’s not affiliated with Google, but Google Safe Search powers it.

The visual search engine promises a safe web environment for kids, with big thumbnail images and bigger text for easy reading. The first few results of any given search are pages specifically written for children and approved by Kiddle editors. The next few results are safe but may not be explicitly written for little ones.

Kiddle has some fun extras like a 700,000 article encyclopedia with searchable topics ranging from the sciences to the arts.

The search engine doesn’t collect any personally identifiable information, and its logs are deleted every 24 hours. There are ads, though.

Try it out at kiddle.co.

6. Wolfram Alpha

Think of Wolfram Alpha as a genius in your browser. You type something you want to know or calculate, and it goes to work finding you an expert-level answer. How? A combination of algorithms, AI tech, and an extensive database.

This site isn’t the place to go if you want to find a plumber or restaurant reviews. But if you need an answer to a math problem, want trustworthy information on world history or events, or need to do personal finance or household math, give it a shot.

Can WolframAlpha answer your question? Search at wolframalpha.com to find out.

Privacy bonus: Wipe out your Google history

If you haven’t reviewed your Google privacy settings in a while, now’s the time to do it. I bet you’ll be shocked by all the searches, locations, and voice messages on file.

Online research: How students can do this better

 

Searching online has many educational benefits. For instance, one study found students who used advanced online search strategies also had higher grades at university.

But spending more time online does not guarantee better online skills. Instead, a student’s ability to successfully search online increases with guidance and explicit instruction.

Young people tend to assume they are already competent searchers. Their teachers and parents often assume this too. This assumption, and the misguided belief that searching always results in learning, means much classroom practice focuses on searching to learn, rarely on learning to search.

Many teachers don’t explicitly teach students how to search online. Instead, students often teach themselves and are reluctant to ask for assistance. This does not result in students obtaining the skills they need.

For six years, I studied how young Australians use search engines. Both school students and home-schoolers (the nation’s fastest-growing educational cohort) showed some traits of online searching that aren’t beneficial. For instance, both groups spent greater time on irrelevant websites than relevant ones and regularly quit searches before finding their desired information.

Here are three things young people should keep in mind to get the full benefits of searching online.

1. Search for more than just isolated facts

Young people should explore, synthesise and question information on the internet, rather than just locating one thing and moving on.

Search engines offer endless educational opportunities but many students typically only search for isolated facts. This means they are no better off than they were 40 years ago with a print encyclopedia.

It’s important for searchers to use different keywords and queries, multiple sites and search tabs (such as news and images).

Part of my (as yet unpublished) PhD research involved observing young people and their parents using a search engine for 20 minutes. In one (typical) observation, a home-school family type “How many endangered Sumatran Tigers are there” into Google. They enter a single website where they read a single sentence.

The parent writes this “answer” down and they begin the next (unrelated) topic – growing seeds.

The student could have learned much more had they also searched for

  • where Sumatra is
  • why the tigers are endangered
  • how people can help them.

I searched Google using the keywords “Sumatran tigers” in quotation marks instead. The returned results offered me the ability to view National Geographic footage of the tigers and to chat live with an expert from the World Wide Fund for Nature (WWF) about them.

Clicking the “news” tab with this same query provided current media stories, including on two tigers coming to an Australian wildlife park and on the effect of palm oil on the species. Small changes to search techniques can make a big difference to the educational benefits made available online.

 

More can be learnt about Sumatran tigers with better search techniques. Source: Shutterstock

2. Slow down

All too often we presume search can be a fast process. The home-school families in my study spent 90 seconds or less, on average, viewing each website and searched a new topic every four minutes.

Searching so quickly can mean students don’t write effective search queries or get the information they need. They may also not have enough time to consider search results and evaluate websites for accuracy and relevance.

My research confirmed young searchers frequently click on only the most prominent links and first websites returned, possibly trying to save time. This is problematic given the commercial environment where such positions can be bought, and given that children tend to take the accuracy of everything online for granted.

Fast search is not always problematic. Quickly locating facts means students can spend time on more challenging educational follow-up tasks – like analysing or categorising the facts. But this is only true if they first persist until they find the right information.

3. You’re in charge of the search, not Google

Young searchers frequently rely on search tools like Google’s “Did you mean” function.

While students feel confident as searchers, my PhD research found they were more confident in Google itself. One Year Eight student explained: “I’m used to Google making the changes to look for me”.

Such attitudes can mean students dismiss relevant keywords by automatically agreeing with the (sometimes incorrect) auto-correct or going on irrelevant tangents unknowingly.

Teaching students to choose websites based on domain name extensions can also help ensure they are in charge, not the search engine. The easily purchasable “.com”, for example, denotes a commercial site, while websites with a “.gov” (government) or “.edu” (education) domain name extension better assure quality information.

Search engines have great potential to provide new educational benefits, but we should be cautious of presuming this potential is actually a guarantee. – The Conversation

By Renee Morrison, Lecturer in Curriculum Studies, University of Tasmania

 
 

Google fined £91m over ad-tracking cookies

Google has been fined 100 million euros (£91m) in France for breaking the country’s rules on online advertising trackers known as cookies.

It is the largest fine ever issued by the French data privacy watchdog CNIL.

US retail giant Amazon was also fined 35 million euros for breaking the rules.

CNIL said Google and Amazon’s French websites had not sought visitors’ consent before advertising cookies were saved on their computers.

Google and Amazon also failed to provide clear information about how the online trackers would be used, and how visitors to the French websites could refuse the cookies, the regulator said.

It has given the tech giants three months to change the information banners displayed on their websites.

If they do not comply, they will be fined a further 100,000 euros per day until the changes are made.

In a statement published by Reuters, Google said: “We stand by our record of providing upfront information and clear controls, strong internal data governance, secure infrastructure, and above all, helpful products.

“Today’s decision under French ePrivacy laws overlooks these efforts and doesn’t account for the fact that French rules and regulatory guidance are uncertain and constantly evolving.”

https://www.bbc.com/news/technology-55259602