Simplifying Concepts

Essence of Life is in Simplicity

Difference between Classification and Clustering




Classification: It assigns instances (items or points) to predefined classes.
Clustering: It groups related points together without labelling them.

Classification: Labelling is an a priori activity.
Clustering: Labelling the groups of points is an a posteriori activity.

Classification: We have a training set containing data that has been previously categorised.
Clustering: We do not know in advance the characteristics that make the data similar.

Classification: The algorithm requires training data.
Clustering: The algorithm does not require labelled training data.

Classification: Based on the training data, the algorithm finds the category that a new data point belongs to.
Clustering: It uses statistical concepts to split a dataset into sub-datasets such that each contains similar data.

Classification: There is a response (decision) variable.
Clustering: There is no response variable.

Classification: Since a training set exists, it is described as supervised learning.
Clustering: Since no training set is used, it is described as unsupervised learning.

Classification, Example 1: An insurance company trying to assign customers to high-risk and low-risk categories.
Clustering, Example 1: An online shopping mart recommending books based on other customers who bought similar books in the past.

Classification, Example 2: Deciding whether a particular patient record can be associated with a specific disease.
Clustering, Example 2: Grouping patient records with similar symptoms without knowing what the symptoms indicate.
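The contrast between the two can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the data (yearly claim counts per customer), the helper names train_centroids, classify and kmeans_1d, and the choice of a nearest-centroid classifier and a simple one-dimensional k-means are all made up for this example.

```python
from statistics import mean

def train_centroids(points, labels):
    """Classification (supervised): learn one centroid per predefined label."""
    groups = {}
    for p, y in zip(points, labels):
        groups.setdefault(y, []).append(p)
    return {y: mean(ps) for y, ps in groups.items()}

def classify(centroids, x):
    """Assign a new point to the class whose centroid is nearest."""
    return min(centroids, key=lambda y: abs(centroids[y] - x))

def kmeans_1d(points, k, iterations=10):
    """Clustering (unsupervised): group points without using any labels."""
    ordered = sorted(points)
    # Deterministic start: spread the initial centres over the sorted data.
    centres = [ordered[i * (len(ordered) - 1) // max(k - 1, 1)] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in ordered:
            nearest = min(range(k), key=lambda i: abs(centres[i] - p))
            clusters[nearest].append(p)
        # Keep the old centre if a cluster ends up empty.
        centres = [mean(c) if c else centres[i] for i, c in enumerate(clusters)]
    return clusters

# Hypothetical training set: claims per year, already labelled by risk.
claims = [0, 1, 1, 5, 6, 7]
risk = ["low", "low", "low", "high", "high", "high"]
model = train_centroids(claims, risk)
print(classify(model, 4))                      # high
# No labels here: the groups are found first and named afterwards.
print(kmeans_1d([1, 2, 3, 10, 11, 12], k=2))   # [[1, 2, 3], [10, 11, 12]]
```

Note that the classifier needs the risk labels before it can do anything, while the clustering function receives only the raw points; naming the resulting groups is left to whoever inspects them.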

Note: The above differences have been derived from the above link through a proper understanding. So please share the link of this webpage as sharing is a way of spreading knowledge. But, please do not copy & paste it in other Website or Forums.

Difference between Periodic and Incremental Crawler


Periodic crawler: It visits the web until the desired number of pages is in the collection and stops once the collection reaches its target size.
Incremental crawler: It keeps visiting pages even after the collection reaches its target size, to incrementally update/refresh the local collection.

Periodic crawler: It operates in batch mode.
Incremental crawler: It operates in steady mode.

Periodic crawler: It runs periodically and updates all the pages in each crawl.
Incremental crawler: It runs continuously without pause and keeps refreshing the local collection.

Periodic crawler: It builds a brand new collection and replaces the old collection with it.
Incremental crawler: It refreshes existing pages in the collection and replaces less important pages with new, more important ones.

Periodic crawler: It indexes/collects pages at long intervals, usually a week or a month.
Incremental crawler: It indexes/collects pages in a timely fashion, say, daily.

Periodic crawler: It is easy to implement.
Incremental crawler: It is relatively difficult to implement.

Periodic crawler: It is less effective and appears to be less intelligent.
Incremental crawler: It is more effective; the Google search engine crawler is an example.

Periodic crawler: It has a fixed frequency.
Incremental crawler: It has a variable frequency.

Periodic crawler: Since the entire collection needs to be replaced with a new one, it imposes heavy overhead/load on the network/server.
Incremental crawler: Since only a part of the collection is replaced, it imposes less overhead/load on the network/server.

Periodic crawler: The freshness of its collection is not stable.
Incremental crawler: The freshness of its collection is stable.

Periodic crawler: It can index a new page only after the next crawling cycle starts.
Incremental crawler: It can index a new page right after it is found by or submitted to the search engine.
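The difference in behaviour can be sketched with a toy simulation in Python. This is a simplified illustration, not how any real search engine crawler works: the "web" is a hypothetical dictionary mapping a URL to its content, the importance scores are invented, and the sketch assumes that detecting whether a page has changed is free (a real crawler would have to estimate or probe for changes).

```python
def periodic_crawl(web, target_size):
    """Build a brand new collection, stopping once it reaches target_size."""
    collection = {}
    for url, page in web.items():
        if len(collection) >= target_size:
            break
        collection[url] = page          # every page is fetched again each cycle
    return collection, len(collection)  # (new collection, number of fetches)

def incremental_crawl(collection, web, importance, target_size):
    """Refresh the existing collection in place; return the number of fetches."""
    fetches = 0
    # Refresh only the pages whose content has changed (assumed known here).
    for url in list(collection):
        if url in web and web[url] != collection[url]:
            collection[url] = web[url]
            fetches += 1
    # Admit new pages, evicting the least important page when the collection is full.
    for url in web:
        if url in collection:
            continue
        if len(collection) < target_size:
            collection[url] = web[url]
            fetches += 1
        else:
            weakest = min(collection, key=lambda u: importance.get(u, 0))
            if importance.get(url, 0) > importance.get(weakest, 0):
                del collection[weakest]
                collection[url] = web[url]
                fetches += 1
    return fetches

# Hypothetical web of three pages, crawled once in batch mode.
web = {"a": "v1", "b": "v1", "c": "v1"}
importance = {"a": 3, "b": 2, "c": 1, "d": 5}
collection, cost = periodic_crawl(web, target_size=3)
print(cost)                                            # 3

# One page changes and a new, more important page appears.
web["b"] = "v2"
web["d"] = "v1"
print(incremental_crawl(collection, web, importance, target_size=3))  # 2
print(sorted(collection))                              # ['a', 'b', 'd']
```

In this toy run the incremental update costs two fetches, while rebuilding the collection with periodic_crawl would fetch three pages again and would pick up "d" only if it happened to be reached before the target size was hit.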

For further understanding and detailed conceptual clarity of the topic, you can refer to the linked source, which also includes a diagrammatic representation of crawler freshness that can be used as a point to differentiate between periodic and incremental crawlers.


Data Warehouse Mining Viva

DWMBI (Data Warehouse Mining & Business Intelligence) Viva Questions:
Hey guys, welcome back. We at conceptSimplified are devoted to simplifying your task. Following are a few of the most frequently asked viva questions in DWMBI; they should be especially useful if you are from Mumbai University.

CH-1 Introduction to Data Mining

  • What is data mining?
  • What are functionalities of Data Mining?
  • Classify Data Mining systems
  • Explain integration of a Data Mining system with a Database or Data Warehouse
  • Architecture of Typical Data mining System?
  • What are major Data Mining Issues?
  • Explain KDD process?

CH-2 Data Warehousing

  • What is Data warehousing?
  • Difference between Database and Data Warehouse?
  • Explain Star Schema?
  • Explain Snowflake Schema?
  • Difference between Star and Snowflake schema?
  • Explain Fact and Dimensional table?
  • Explain Factless Fact table?
  • What is OLAP?
  • What is OLTP?
  • State applications of OLAP.
  • Difference between OLAP and OLTP?
  • State OLAP operations?

CH-3 Data Preprocessing

  • Difference between clustering and classification?
  • What is Data cleaning?
  • What is Data Integration?
  • What is Data Transformation?
  • State Data Reduction techniques?
  • Explain Data Reduction techniques in short?
  • Which techniques are used for Numerosity Reduction?
  • Give ways of handling noisy data?
  • What is noisy data?

CH-4 Mining Frequent Patterns, Associations and Correlations

  • What is Market basket analysis?
  • Explain Apriori Algorithm?
  • Explain FP-Tree Algorithm?
  • How is FP-Tree better than the Apriori Algorithm?
  • Give formula of Support and Confidence?
  • K-means Algorithm
  • What is constraint-based association rule mining?
  • Explain mining multilevel association Rules?
  • Explain mining multidimensional association Rules?
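For the support and confidence question above, the standard definitions for a rule X → Y are: support = (number of transactions containing both X and Y) / (total number of transactions), and confidence = support(X and Y together) / support(X). A minimal Python sketch follows; the basket data and the function names are hypothetical and chosen only for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(itemset <= set(t) for t in transactions)
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Of the transactions containing the antecedent, the fraction that
    also contain the consequent. Assumes the antecedent occurs at least once."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(support(baskets, {"bread", "milk"}))       # 0.5
print(confidence(baskets, {"bread"}, {"milk"}))  # 2 of the 3 bread baskets, about 0.667
```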

CH-5 Classification and Prediction

  • What is Classification?
  • What is Prediction?
  • State Issues regarding Classification and Prediction?
  • State various classification methods?
  • What is Regression?
  • State types of Regression?
  • Explain Linear Regression?
  • Explain Non Linear Regression?
  • Give formulae of a) Information Gain b) Entropy c) Gini index?
  • Explain Decision tree in brief?
  • Explain Bayesian classification?
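For the formula question above, the usual textbook definitions are: Entropy(S) = -Σ p_i log2(p_i); Gini(S) = 1 - Σ p_i²; and Information Gain = Entropy(S) - Σ (|S_v| / |S|) × Entropy(S_v), where p_i is the fraction of class i in S and S_v are the partitions produced by a split. A small Python sketch of these three measures is given below; the function names and the yes/no sample labels are assumptions made for the example.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy(S) = -sum(p_i * log2(p_i)) over the classes present in S."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini(S) = 1 - sum(p_i ** 2) over the classes present in S."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(parent, partitions):
    """Entropy of the parent minus the size-weighted entropy of its partitions."""
    n = len(parent)
    remainder = sum(len(p) / n * entropy(p) for p in partitions)
    return entropy(parent) - remainder

# Hypothetical class labels before and after a perfect split.
parent = ["yes", "yes", "no", "no"]
split = [["yes", "yes"], ["no", "no"]]
print(entropy(parent))                  # 1.0
print(gini(parent))                     # 0.5
print(information_gain(parent, split))  # 1.0
```

A decision tree learner would evaluate candidate splits with one of these measures and pick the split with the highest gain (or the lowest weighted Gini).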

CH-6 Cluster Analysis

  • What is clustering?
  • What is Clustering Analysis?
  • State types of data in Clustering Analysis?
  • State Categories of clustering methods?
  • State partitioning methods?
  • What is BIRCH?
  • What is ROCK?
  • Explain DBSCAN?
  • Explain K-means?
  • Explain K-medoids?
  • Explain Agglomerative Clustering?
  • Explain Outliers Analysis?

CH-7 Mining Stream and Sequence Data

  • What is Stream Data?
  • Explain Association mining in stream data?
  • Explain Sequence Mining in transactional database?
  • Explain different Data Stream methodologies?
  • Explain Hoeffding Tree Algorithm?

CH-8 Spatial Data & Text Mining

  • Compare Data Mining and Text Mining.
  • State Spatial Clustering methods.
  • What is Spatial OLAP?
  • What is Spatial data mining?
  • State different approaches in Text Mining?
  • What is Web mining?
  • What is Web Content Mining?
  • What is Web Structure Mining?

CH-9 Data Mining for Business Intelligence Applications

  • What is business intelligence?
  • State Business Intelligence issues?
  • How can Data Mining be used for BI Applications?
  • Explain Data Mining for Market Segmentation?
  • Explain Data Mining for Retail Industry?

Software Engineering viva questions

Hello guys, vivas are around the corner and you have no clue what to study? We at conceptSimplified simplify your task. Below are a few questions that we recommend you prepare before your vivas.

  • Name the software process models. Describe them.
  • What is process metrics and project metrics?
  • Name the software estimation techniques. Give the formula. Explain any one technique?
  • What are the principles of project planning?
  • Function Points & Lines of Code.
  • Define risk analysis. Steps of risk analysis.
  • What is RMMM(Risk Management, Mitigation & Monitoring) plan?
  • What are the principles of project scheduling?
  • What are milestones, WBS chart and Gantt chart?
  • What is software requirement? Requirements in engineering process.
  • Data flow diagrams. Characteristics of good SRS.
  • Software design abstraction and architecture.
  • Definition and types of cohesion and coupling.
  • What are the design issues?
  • Explain SCM process and tools.
  • What is Software quality assurance?
  • Explain Quality metrics (Defect Removal Efficiency)?
  • What is baseline?
  • Different types of software testing.
  • Difference between black box testing and white box testing.
  • What is Software engineering?
  • What is Reverse engineering?
  • Service oriented software engineering.
  • What is alpha-beta testing? (Most asked question)
  • What is requirement analysis and specification?
  • What is use case model?
  • Test case and Test scenario.
  • Draw use case for library management, ATM.
  • What is version control?
  • What is change control?
  • What is requirement gathering?
  • What is FTR (Formal Technical Review)?
  • Explain the steps in FTR?
  • Define Software Engineering?
  • What are estimation techniques?
  • Explain design principles?
  • What is CoCoMo model?
  • What are testing principles?

Note: These are a few of the most frequently asked viva questions. For solutions, it would be best to refer to Software Engineering: A Practitioner's Approach, 7th Edition, by Roger Pressman.

Persistent Placement Process

Persistent Placement Process-Recruitment Drive:

About Persistent:
Persistent is a place that values enrichment both on a professional and personal level. With this in mind, Life @ Persistent is dynamic. Life for employees is brimming with activities and there are groups for you to pursue your personal interests in areas such as arts, culture and sports.

Sports buffs can participate in sports events organized by the Persistent Sports Committee. If your interest lies more on the artistic side, you will have plenty of occasions to express your creative self through the Persistent Arts Circle, fondly called PAC. Other groups include the Welfare Activity Group (WAG), PRERANA, which is a support group for women professionals, and Green Persistent, which promotes ecological initiatives.

They also provide health and other benefits to employees. A doctor and a mental health counselor visit all of their campuses on a regular basis. Helpdesks are available to help resolve queries about banking, tax planning, insurance, etc. To encourage fitness, each campus has a fully equipped gym and a yoga room with a full-time instructor. They also serve lunch and snacks for all employees.

Placement Process:
General outline of the Persistent Placement Process (only for IT/COMPS candidates):

  • Pre-Placement talk
  • Online Aptitude Test
  • Technical Interview I
  • Technical Interview II
  • HR and Management Interview

Pre-Placement talk:
Now coming to the placement process: it is slightly different from that of TCS. First, you attend a pre-placement talk, which lasts around 30 to 45 minutes. They talk about the company's roots and origin, its current position in the industry and, most importantly, its working lifestyle. After this comes the aptitude test. One more thing to clarify is that Persistent is not a bulk recruiter and usually takes 5 to 10% of candidates (for the Navi Mumbai region), which was the case on my campus. As of 2014-2015, Persistent was offering Rs 3.59 lakhs to new recruits in the Mumbai region.
NOTE: The placement process for me was a pooled campus recruitment drive, with 5 major, reputed colleges participating.

Online Aptitude Test:
One thing I found is that Persistent includes a programming question as a challenge in its aptitude test; that is, the aptitude and programming questions were integrated. Even more interesting, the only candidates who cleared the aptitude test were those who got through this programming challenge. The main portion of the test covered Operating System fundamentals, Data Structures (mainly tree and linked list concepts), and C++ syntax and concepts. The non-technical part focused on logical and analytical reasoning more than on numerical data relations or mathematical aptitude. There was a C++ question based on the substring method of the string class: a class made use of it in a method that took a string and a count as parameters, and pseudocode was provided. This speaks volumes about Persistent's need for logically and analytically sound candidates.

TECHNICAL INTERVIEW I: (Duration: One Hour and 15 Minutes)
All the candidates who clear the aptitude-cum-programming challenge test go on to TI-I. Of the 150-odd students who took the aptitude test, only 29 were selected for TI-I. The duration mentioned above might worry you, but once you are in the interview room, time moves fast. Be ready to be grilled in Persistent's technical interview. I was asked which language I was comfortable in, and I replied Java. The next moment I was asked questions on C++. The question was to explain the layout of a simple C++ program (where the main function goes, where the classes go, etc.). The basic concepts of C++ and Java are similar, so he may have been checking my skills there.

Then he slowly moved on to Java and asked about public static void main: why each keyword is necessary and what happens if it is removed. Then the interviewer moved on to data structures: linked lists, stacks, queues and many more. He asked for real-life examples of a few of them. Then came binary trees and other tree concepts. He did not move easily from one question to the next, and it was quite a tedious session for me. But he was satisfied with most of my answers and I cleared this round. Do not worry unnecessarily; be confident, because confidence is what matters. I have seen many candidates with knowledge and a good grip on a programming language fail to make it, while those who had confidence cleared this round.

TECHNICAL INTERVIEW II:
This is the actual filtering interview. It started with me waiting for an hour outside the cubicle, completely tense, standing with my back towards the interviewer. Only the glass partition and the candidate being interviewed separated us. During my long wait at the door I had to rotate my neck nearly 360 degrees to get a glimpse of things inside. It made me more nervous, as I could see the girl inside continuously writing or talking, with body language as if she were 'trying to convince, with all the greatness installed in her system', while the interviewer just looked on casually. After that tense hour the door finally opened, but I was still waiting outside. I had drunk a few glasses of water during that hour and was very nervous, only because I had just gone through a tedious TI-I. Soon a volunteer asked me to enter the room, and things turned upside down.

The atmosphere soon became the same as that of TI-I. The interviewer smiled and asked a few personal questions, and I answered them with a happy face but a heavy heart. After that he went through my resume and asked about my internship with an online engineering portal and things related to it. I was quick to answer and he seemed interested. Soon he asked about Java and the famous static keyword. Later he asked about Java technologies like Swing and AWT. The next questions were on Servlets and JSP, their differences, and the NetBeans IDE. He also asked about the MVC model and for an explanation of it.

All these questions took a lot of time. Then he pulled his chair closer, leaned on the table and smiled. He said, "Now I am going to ask you a few questions for which you have to provide not just an answer but also the logic to support it." This got me going; I smiled a bit and replied, "OK." Then he asked the first puzzle-type question of the interview: "How many steps do you walk daily, from your doorstep till the moment you step into the college classroom? You have exactly 10 minutes," he said, looking at his watch (it was already 5:34 in the evening). I answered it without pen and paper and explained every step of my reasoning. At the end he was convinced and we both laughed. He was interested and happy, and said, "I want to ask one more." I said sure, and he asked, "Assume Rajasthan is the state of concern for the moment, and tell me how many camels were drinking water today at 2 pm." I don't know how, but I answered this as it came, with no hiccups, and he said, "I'm convinced." Soon he asked a few more personal questions and TI-II was over.

HR & Management Interview:
It started 20 minutes after my TI-II and lasted the same span of time, 20 minutes. I entered and there were two HR managers waiting. One of them stood throughout the interview; the other sat casually, leaning back in his seat. As I went in, he asked me to have a seat. He asked me the most generic questions:

  • Tell me about Yourself.
  • Your Hobbies.
  • Your Interests and qualities that will benefit our Organization.

(In the meantime, while I was busy answering, the TI-II interviewer stepped in and I greeted him with a smile.) Later the standing HR manager asked me a question:
Suppose you are a team leader with one promotion to give away among three of your team members. No. 1 is punctual but does not guarantee a 100% solution. No. 2 is not punctual, but when he is around the job gets done. No. 3 is lazy but guarantees a 100% solution. How will you decide the promotion?

I took approximately 20 seconds and replied, "I will promote the lazy member on the condition that he gives up his laziness."
There was a lot of cross-questioning on why not No. 1 or No. 2, but I held on to my answer. Later he placed my resume on the bunch of papers and said, "SELECTED!!!"

I did not rely on his comment, but I felt that I was through. After exactly 13 long days, my college TPO got a mail with my name at no. 3; the head count of final selects was only 13.


  • This was entirely based on my experience, and I in no way assure you that you will go through the same process. This was the procedure followed for 2014-2015 candidates.
  • Go through all the data structure basics, tree basics and Java fundamentals and you should make it.
    Be confident and present yourself as logical and sound.
  • Don't play skipping stones while answering "no single answer" questions/puzzles.
  • Stick to one answer and provide a valid reason.