BI Hackathon 1
CHOOLS, 13 Mar

Welcome to Naan Mudhalvan TNSDC - CHOOLS Business Intelligence and Analytics - Hackathon Level 2
(YOU HAVE ONLY 60 MINUTES TO COMPLETE THIS PART [75 Questions])
**** Wishing You All The Best ****

Instructions:
To start the quiz, please provide your name, email ID, and phone number.
All questions are mandatory.
You must answer every question on a page before moving on to the next page of questions.
Your responses can be changed at any time.
After completing the quiz, click the Submit button. Once the time is up, the quiz is submitted automatically.

What is the primary goal of Exploratory Data Analysis (EDA)?
a) To clean data
b) To summarize the main characteristics of data
c) To create machine learning models
d) To encrypt datasets

Which of the following is NOT a data visualization technique in EDA?
a) Box Plot
b) Histogram
c) Decision Tree
d) Scatter Plot

What does a box plot primarily show?
a) Mean, median, and variance
b) Minimum, maximum, quartiles, and outliers
c) Correlation between two variables
d) The relationship between categorical and numerical data

Which EDA technique helps identify correlation between variables?
a) Heatmap
b) Histogram
c) Pie Chart
d) Bar Chart

What does a histogram represent?
a) Frequency distribution of numerical data
b) Relationship between two categorical variables
c) The count of missing values
d) The median of a dataset

What is an outlier in EDA?
a) A missing value
b) A data point significantly different from others
c) The mean of a dataset
d) A categorical variable

Which metric is used to measure central tendency?
a) Mean
b) Standard Deviation
c) Range
d) Outlier

What does a scatter plot show?
a) Distribution of a single variable
b) Relationship between two numerical variables
c) Outliers only
d) The spread of categorical data

What does the interquartile range (IQR) help in detecting?
a) Missing values
b) Data distribution
c) Outliers
d) The correlation between two variables

What is the variance of a dataset?
a) A measure of central tendency
b) The spread of data around the mean
c) The relationship between two variables
d) The frequency of categorical values

What is the primary goal of predictive analytics?
a) Understanding past trends
b) Predicting future outcomes
c) Cleaning raw data
d) Organizing datasets

Which of the following is a supervised learning algorithm?
a) K-Means Clustering
b) Linear Regression
c) Apriori Algorithm
d) Principal Component Analysis

What type of model is used in classification tasks?
a) Linear Regression
b) Decision Tree
c) K-Means Clustering
d) PCA

Which algorithm is commonly used for predicting continuous values?
a) Logistic Regression
b) Random Forest
c) Linear Regression
d) Naïve Bayes

What is overfitting in predictive models?
a) When a model performs well on training data but poorly on new data
b) When a model does not learn from data
c) When a model ignores outliers
d) When a model performs equally on all datasets

Which evaluation metric is used for classification models?
a) Mean Squared Error (MSE)
b) Accuracy
c) R-squared
d) Sum of Squares

What does precision measure in classification?
a) The percentage of correctly predicted positive cases
b) The percentage of correctly predicted negative cases
c) The total number of predictions made
d) The error rate of a model

Which metric is best for imbalanced classification problems?
a) Accuracy
b) F1-score
c) Mean Absolute Error
d) R-Squared

What is the purpose of train-test split?
a) To improve model accuracy
b) To evaluate model performance on unseen data
c) To remove missing values
d) To increase the size of the dataset

Which algorithm is used for time series forecasting?
a) KNN
b) ARIMA
c) Naïve Bayes
d) Decision Tree

What is the purpose of feature engineering in predictive analytics?
a) Transforming raw data into useful inputs for models
b) Removing duplicate records
c) Normalizing data
d) Splitting data into train and test sets

What is cross-validation used for?
a) To evaluate model performance
b) To clean missing values
c) To visualize data
d) To train the model on a single dataset

What does hyperparameter tuning help achieve?
a) Optimizing model performance
b) Removing noisy data
c) Scaling numerical values
d) Reducing dataset size

What is a confusion matrix?
a) A table showing correct and incorrect predictions
b) A type of regression model
c) A data visualization method
d) A clustering technique

What is a ROC curve used for?
a) Evaluating classification model performance
b) Identifying missing values
c) Measuring regression accuracy
d) Handling categorical data

What is the purpose of data normalization in EDA?
a) To remove duplicates
b) To scale numerical values for better comparison
c) To convert categorical data into numerical form
d) To increase the number of variables

Which visualization method is best for showing the distribution of a single numerical variable?
a) Bar Chart
b) Histogram
c) Line Chart
d) Heatmap

What is the mean absolute deviation (MAD) used for?
a) Measuring the spread of data
b) Finding the correlation between two variables
c) Identifying categorical variables
d) Checking for missing values

What does a correlation coefficient measure?
a) Strength and direction of the relationship between two variables
b) The frequency of categorical data
c) The sum of squared errors in regression
d) The missing values in a dataset

In a positively skewed distribution, the tail is on which side?
a) Left
b) Right
c) Center
d) Both sides

Which of the following is a measure of spread or dispersion in data?
a) Median
b) Standard Deviation
c) Mode
d) Percentile

What does a Pareto Chart help identify?
a) The most significant factors in a dataset
b) The correlation between variables
c) The median of a dataset
d) The normality of data

Which of the following is NOT a measure of central tendency?
a) Mean
b) Mode
c) Standard Deviation
d) Median

In missing value imputation, what is the common approach for numerical data?
a) Replacing missing values with the median or mean
b) Deleting the entire dataset
c) Converting numerical data to categorical
d) Removing all duplicate values

What is the purpose of dimensionality reduction in EDA?
a) To reduce the number of irrelevant or redundant variables
b) To remove missing values
c) To increase model complexity
d) To increase dataset size

Which technique is commonly used for reducing overfitting in predictive models?
a) Regularization
b) Normalization
c) Encoding
d) Standardization

What is the main difference between classification and regression?
a) Classification predicts continuous values, regression predicts discrete values
b) Classification predicts categorical labels, regression predicts continuous values
c) Classification is used for time series data only
d) Regression is only used for clustering problems

Which of the following is an example of a classification problem?
a) Predicting the price of a house
b) Predicting whether an email is spam or not
c) Forecasting stock prices
d) Predicting temperature trends

What is feature selection in machine learning?
a) Selecting the most relevant features for better model performance
b) Removing all outliers from the dataset
c) Generating new features from existing ones
d) Encoding categorical variables

Which technique is used to handle imbalanced datasets?
a) SMOTE (Synthetic Minority Over-sampling Technique)
b) One-hot encoding
c) Standardization
d) Principal Component Analysis

What is the purpose of hyperparameter tuning?
a) To optimize model performance by adjusting algorithm parameters
b) To remove missing values from a dataset
c) To transform categorical variables into numerical values
d) To split data into training and testing sets

What is the bias-variance tradeoff in predictive modeling?
a) The balance between model simplicity and complexity
b) The error caused by incorrect data labeling
c) The tradeoff between classification and regression tasks
d) The process of hyperparameter tuning

Which machine learning technique is used for predicting categorical outcomes?
a) Linear Regression
b) Logistic Regression
c) K-Means Clustering
d) Principal Component Analysis

What does RMSE (Root Mean Squared Error) measure?
a) The square root of the average squared difference between actual and predicted values
b) The percentage of correctly classified cases
c) The mean of a dataset
d) The total number of predictions made

Which of the following algorithms is best suited for time series forecasting?
a) Decision Tree
b) ARIMA
c) K-Means Clustering
d) Random Forest

What is the main advantage of using ensemble models?
a) Improved model accuracy by combining multiple weak models
b) Increased dataset size
c) Faster training times
d) Reduced computational complexity

Which of the following is NOT an ensemble learning technique?
a) Bagging
b) Boosting
c) K-Means
d) Stacking

Which technique helps prevent overfitting in decision trees?
a) Pruning
b) One-hot encoding
c) Feature scaling
d) Dimensionality reduction

What is the primary use of Principal Component Analysis (PCA)?
a) Reducing the number of variables while retaining important information
b) Handling missing values
c) Normalizing categorical variables
d) Increasing dataset complexity

Which model evaluation metric is used for imbalanced datasets?
a) Precision-Recall Curve
b) Mean Squared Error
c) R-Squared
d) Adjusted R-Squared

Which of the following is NOT a goal of Exploratory Data Analysis (EDA)?
a) Identifying patterns and trends
b) Cleaning and preprocessing data
c) Building predictive models
d) Understanding relationships between variables

What does a box plot NOT show?
a) Mean
b) Median
c) Outliers
d) Interquartile range

Which measure is least affected by outliers?
a) Mean
b) Median
c) Standard Deviation
d) Variance

What does a low p-value in a statistical test indicate?
a) Strong evidence against the null hypothesis
b) Weak evidence against the null hypothesis
c) No relationship between variables
d) High variance in the data

Which method is best suited for detecting multicollinearity?
a) Correlation Matrix
b) Variance Inflation Factor (VIF)
c) Histogram
d) ANOVA

Which of the following distributions is not symmetric?
a) Normal Distribution
b) Uniform Distribution
c) Exponential Distribution
d) Binomial Distribution

What does a right-skewed histogram indicate?
a) Most values are concentrated on the lower end
b) Most values are concentrated on the higher end
c) A normal distribution
d) An equal spread of values

Which transformation is used to handle right-skewed data?
a) Square root transformation
b) Log transformation
c) Exponential transformation
d) Reciprocal transformation

Which test is commonly used to check the normality of data?
a) Shapiro-Wilk test
b) ANOVA
c) Chi-square test
d) Kolmogorov-Smirnov test

What does a correlation coefficient of -0.85 imply?
a) Strong positive correlation
b) Strong negative correlation
c) Weak correlation
d) No correlation

What is the primary goal of dimensionality reduction in EDA?
a) To increase model accuracy
b) To remove redundant features
c) To increase the number of features
d) To handle missing values

Which algorithm is commonly used for dimensionality reduction?
a) PCA (Principal Component Analysis)
b) K-Means
c) Decision Trees
d) Logistic Regression

Which method is most appropriate for dealing with missing values in categorical data?
a) Mean imputation
b) Mode imputation
c) Median imputation
d) Deleting missing values

What is an interaction effect in EDA?
a) When one variable affects another variable
b) When two variables combined have a different impact than individually
c) When a variable has a high variance
d) When a dataset has missing values

Which visualization is best for checking outliers?
a) Scatter plot
b) Box plot
c) Line chart
d) Bar chart

What is the purpose of feature scaling?
a) To make features comparable
b) To remove outliers
c) To increase model accuracy
d) To add new features

Which correlation coefficient value indicates no relationship between two variables?
a) 0
b) 1
c) -1
d) 0.5

Which algorithm is best for a binary classification problem?
a) Logistic Regression
b) Linear Regression
c) K-Means Clustering
d) PCA

What is overfitting?
a) When a model performs well on training data but poorly on test data
b) When a model underperforms on both training and test data
c) When a model has too few parameters
d) When a model has a high bias

What is the purpose of regularization in predictive modeling?
a) To reduce overfitting
b) To increase training accuracy
c) To remove missing values
d) To handle categorical variables

Which metric is NOT used for classification models?
a) Accuracy
b) Precision
c) R-Squared
d) Recall

Which technique is used to handle imbalanced datasets?
a) SMOTE (Synthetic Minority Over-sampling Technique)
b) PCA
c) Standardization
d) Mean Imputation

What does the ROC curve measure?
a) Model’s ability to differentiate between classes
b) Model’s accuracy on test data
c) Mean squared error
d) Training loss

Which loss function is commonly used for classification models?
a) Cross-entropy loss
b) Mean squared error
c) Huber loss
d) R-Squared

What is the purpose of a confusion matrix?
a) To evaluate model performance in classification tasks
b) To detect outliers
c) To reduce dataset dimensionality
d) To standardize variables

Which of the following is NOT a goal of Exploratory Data Analysis (EDA)?
a) Identifying patterns and trends
b) Cleaning and preprocessing data
c) Building predictive models
d) Understanding relationships between variables

What does a box plot NOT show?
a) Mean
b) Median
c) Outliers
d) Interquartile range

Which measure is least affected by outliers?
a) Mean
b) Median
c) Standard Deviation
d) Variance

What does a low p-value in a statistical test indicate?
a) Strong evidence against the null hypothesis
b) Weak evidence against the null hypothesis
c) No relationship between variables
d) High variance in the data

Which method is best suited for detecting multicollinearity?
a) Correlation Matrix
b) Variance Inflation Factor (VIF)
c) Histogram
d) ANOVA

Which of the following distributions is not symmetric?
a) Normal Distribution
b) Uniform Distribution
c) Exponential Distribution
d) Binomial Distribution

What does a right-skewed histogram indicate?
a) Most values are concentrated on the lower end
b) Most values are concentrated on the higher end
c) A normal distribution
d) An equal spread of values

Which transformation is used to handle right-skewed data?
a) Square root transformation
b) Log transformation
c) Exponential transformation
d) Reciprocal transformation

Which test is commonly used to check the normality of data?
a) Shapiro-Wilk test
b) ANOVA
c) Chi-square test
d) Kolmogorov-Smirnov test

What does a correlation coefficient of -0.85 imply?
a) Strong positive correlation
b) Strong negative correlation
c) Weak correlation
d) No correlation

What is the primary goal of dimensionality reduction in EDA?
a) To increase model accuracy
b) To remove redundant features
c) To increase the number of features
d) To handle missing values

Which algorithm is commonly used for dimensionality reduction?
a) PCA (Principal Component Analysis)
b) K-Means
c) Decision Trees
d) Logistic Regression

Which method is most appropriate for dealing with missing values in categorical data?
a) Mean imputation
b) Mode imputation
c) Median imputation
d) Deleting missing values

What is an interaction effect in EDA?
a) When one variable affects another variable
b) When two variables combined have a different impact than individually
c) When a variable has a high variance
d) When a dataset has missing values

Which visualization is best for checking outliers?
a) Scatter plot
b) Box plot
c) Line chart
d) Bar chart

Which algorithm is best for a binary classification problem?
a) Logistic Regression
b) Linear Regression
c) K-Means Clustering
d) PCA

What is overfitting?
a) When a model performs well on training data but poorly on test data
b) When a model underperforms on both training and test data
c) When a model has too few parameters
d) When a model has a high bias

What is the purpose of regularization in predictive modeling?
a) To reduce overfitting
b) To increase training accuracy
c) To remove missing values
d) To handle categorical variables

Which metric is NOT used for classification models?
a) Accuracy
b) Precision
c) R-Squared
d) Recall

Which technique is used to handle imbalanced datasets?
a) SMOTE (Synthetic Minority Over-sampling Technique)
b) PCA
c) Standardization
d) Mean Imputation

What does the ROC curve measure?
a) Model’s ability to differentiate between classes
b) Model’s accuracy on test data
c) Mean squared error
d) Training loss

Which loss function is commonly used for classification models?
a) Cross-entropy loss
b) Mean squared error
c) Huber loss
d) R-Squared

What is the purpose of a confusion matrix?
a) To evaluate model performance in classification tasks
b) To detect outliers
c) To reduce dataset dimensionality
d) To standardize variables

Which algorithm is best suited for time-series forecasting?
a) ARIMA
b) K-Means
c) Decision Trees
d) Logistic Regression

Which metric is commonly used for regression models?
a) RMSE (Root Mean Squared Error)
b) F1-score
c) Precision
d) Sensitivity

What is the primary goal of a Business Intelligence (BI) strategy?
a) Increase data storage
b) Improve decision-making
c) Reduce employee count
d) Minimize hardware costs

Which of the following is the first step in developing a BI strategy?
a) Data visualization
b) Data warehousing
c) Understanding business objectives
d) Selecting BI tools

BI strategy should be aligned with:
a) IT infrastructure only
b) Business goals and objectives
c) Marketing strategies
d) Customer feedback alone

A key component of a BI strategy includes:
a) Randomized data collection
b) Data integration and governance
c) Ignoring real-time reporting
d) Restricting user access to dashboards

Which department is primarily responsible for driving BI strategy?
a) IT
b) Finance
c) Human Resources
d) Marketing

The effectiveness of a BI strategy is measured by:
a) The number of reports generated
b) Business impact and insights derived
c) Total data stored
d) BI tool cost savings

What is a common challenge in BI strategy implementation?
a) Data silos
b) Increased storage capacity
c) Lack of internet access
d) Small datasets

A strong BI strategy helps an organization to:
a) Increase operational efficiency
b) Reduce decision-making time
c) Enhance data-driven culture
d) All of the above

Which of the following is NOT a BI strategy best practice?
a) Involving stakeholders early
b) Defining KPIs
c) Keeping BI tools restricted to IT teams only
d) Ensuring data governance

BI maturity models help organizations:
a) Measure BI adoption levels
b) Increase storage costs
c) Avoid data analysis
d) Replace all reports with dashboards

What is the first phase of BI implementation?
a) Data collection
b) Planning and requirement analysis
c) Reporting
d) Dashboard creation

Which of the following is a key factor in BI implementation success?
a) User adoption
b) Only using open-source tools
c) Ignoring data security
d) Manual data processing

ETL stands for:
a) Extract, Transform, Load
b) Evaluate, Transfer, Learn
c) Export, Track, Link
d) Encrypt, Transfer, Load

The role of ETL in BI implementation is to:
a) Extract data from various sources
b) Transform raw data into a useful format
c) Load data into a data warehouse
d) All of the above

A Data Warehouse is used in BI for:
a) Storing and integrating data from multiple sources
b) Only storing transactional data
c) Real-time analysis only
d) Ignoring historical data

What is an important feature of BI dashboards?
a) Interactive visualization
b) Static, non-interactive graphs
c) Limited user access
d) Only numerical reports

What is the primary benefit of self-service BI?
a) Reduces IT dependency
b) Increases data complexity
c) Slows down reporting
d) Requires programming knowledge

Which is NOT a BI implementation challenge?
a) Poor data quality
b) Resistance to change
c) Clear business objectives
d) High implementation costs

Which of the following BI tools is widely used?
a) Tableau
b) Power BI
c) Qlik Sense
d) All of the above

Data Lakes are used in BI to:
a) Store structured and unstructured data
b) Process real-time analytics only
c) Eliminate data warehouses
d) Reduce reporting accuracy

BI governance focuses on:
a) Data security and compliance
b) Disorganized data access
c) Avoiding user authentication
d) Restricting reports to only IT teams

Data governance ensures:
a) Data accuracy and consistency
b) Random access to data
c) Poor data quality
d) Unauthorized user access

A BI governance framework includes:
a) Data access policies
b) Security protocols
c) Compliance regulations
d) All of the above

Data quality management in BI governance focuses on:
a) Data accuracy
b) Data completeness
c) Data consistency
d) All of the above

Who is responsible for BI governance?
a) Chief Data Officer (CDO)
b) Business users only
c) IT department only
d) No one, it happens automatically

What is the main objective of BI governance?
a) Ensure secure and controlled access to BI data
b) Increase data silos
c) Reduce data usage
d) Make BI reports manually

Role-based access control in BI governance means:
a) Users have access based on their job roles
b) Everyone gets full access
c) Data is not protected
d) Only IT has access to all data

BI governance policies should be:
a) Regularly updated
b) Ignored after implementation
c) Managed without a framework
d) Only relevant to large organizations

Which regulation is important in BI governance?
a) GDPR
b) PCI-DSS
c) HIPAA
d) All of the above

BI governance improves:
a) Data security and compliance
b) Manual reporting
c) Data loss
d) Unstructured decision-making

Which of the following is a key characteristic of a successful BI strategy?
a) Ad-hoc reporting
b) Data-driven decision-making
c) Only using historical data
d) Ignoring real-time analytics

BI strategy should be reviewed:
a) Once and never updated
b) Annually or as needed
c) Only when issues arise
d) Every 10 years

Which metric is commonly used to measure BI success?
a) Revenue growth
b) Report count
c) Number of dashboards created
d) Data storage capacity

The primary purpose of BI KPIs is to:
a) Track business performance
b) Increase data complexity
c) Reduce data accuracy
d) Limit reporting capabilities

BI strategy should support:
a) Only executive decision-making
b) Both operational and strategic decisions
c) Data collection only
d) Eliminating historical data

Which of the following is NOT a BI implementation best practice?
a) Involving key stakeholders
b) Continuous training and support
c) Implementing without a clear roadmap
d) Ensuring data security

Which type of database is commonly used for BI?
a) Relational databases
b) Data warehouses
c) NoSQL databases
d) All of the above

What is the role of OLAP in BI?
a) Enables multi-dimensional analysis
b) Stores unstructured data
c) Ignores data aggregation
d) Slows down reporting

Which tool is commonly used for ETL processes?
a) Talend
b) Informatica
c) Apache NiFi
d) All of the above

Which of the following is an example of unstructured data used in BI?
a) Social media posts
b) Sales reports
c) Customer transaction records
d) Employee salary data

Which cloud platforms are commonly used for BI solutions?
a) AWS
b) Microsoft Azure
c) Google Cloud Platform
d) All of the above

What is an important feature of real-time BI?
a) Instant data updates
b) Only uses batch processing
c) Limited to historical analysis
d) Requires no data integration

BI implementation success depends on:
a) Strong data governance
b) User training and adoption
c) High-quality data
d) All of the above

What is data lineage in BI governance?
a) Tracking data origin and transformations
b) Deleting historical data
c) Restricting data access
d) Ignoring data security policies

BI governance policies help organizations comply with:
a) GDPR
b) HIPAA
c) SOX
d) All of the above

What is the role of metadata in BI governance?
a) Provides data definitions and descriptions
b) Reduces data storage
c) Deletes duplicate data
d) Hides data from users

A strong data governance policy improves:
a) Data quality and consistency
b) Data silos
c) Unauthorized access
d) Inconsistent reports

Which tool is commonly used for BI governance?
a) Collibra
b) Informatica Data Governance
c) Alteryx
d) Both a and b

BI governance should define:
a) Data ownership and accountability
b) Only technical rules
c) Access for IT teams only
d) One-time policies with no updates

What is the purpose of data access controls in BI governance?
a) Prevent unauthorized data access
b) Allow open access to all users
c) Reduce compliance regulations
d) Remove encryption from sensitive data

What is the primary goal of a Business Intelligence (BI) strategy?
a) Increase data storage capacity
b) Improve decision-making through data-driven insights
c) Replace manual reporting with spreadsheets
d) Reduce IT infrastructure costs

Which of the following is NOT a key component of a BI strategy?
a) Data governance
b) Predictive analytics
c) Supply chain management
d) Business user training

What is the first step in developing a BI strategy?
a) Selecting a BI tool
b) Defining business objectives
c) Implementing dashboards
d) Creating a data warehouse

A well-defined BI strategy ensures:
a) IT departments control all data
b) Business users have access to relevant insights
c) Data is stored in multiple locations
d) BI tools replace traditional databases

Which of the following is a key benefit of a self-service BI strategy?
a) Reducing reliance on IT teams
b) Eliminating the need for data governance
c) Ensuring all reports are manually created
d) Increasing data complexity

Which BI strategy focuses on embedding analytics into business applications?
a) Standalone BI
b) Embedded BI
c) Self-service BI
d) Cloud-based BI

Which business function benefits most from real-time BI strategy?
a) Human Resources
b) Marketing Campaigns
c) IT Support
d) Historical Reporting

Which type of analytics is the primary focus of BI strategy?
a) Descriptive analytics
b) Diagnostic analytics
c) Predictive analytics
d) Prescriptive analytics

What is a key challenge when implementing a BI strategy?
a) Data integration from multiple sources
b) Reducing data storage costs
c) Eliminating spreadsheets
d) Restricting access to reports

A centralized BI strategy ensures:
a) Each department uses separate BI tools
b) Data is managed through a single source of truth
c) Business users independently manage data storage
d) BI reports are manually created

What does a BI roadmap typically include?
a) BI vision, milestones, and implementation plan
b) A list of competitors
c) Only IT infrastructure details
d) A single BI tool selection

The success of a BI strategy is measured by:
a) Number of reports generated
b) Data storage capacity
c) Business impact and ROI
d) The complexity of BI tools used

Which technology enhances BI strategy by handling unstructured data?
a) Data warehouses
b) NoSQL databases
c) Spreadsheets
d) Flat files

The BI strategy of a retail business should prioritize:
a) Real-time inventory analytics
b) Employee payroll processing
c) IT help desk automation
d) Network security monitoring

What is the role of data democratization in a BI strategy?
a) Restricting BI access to IT teams
b) Enabling all employees to access data for decision-making
c) Storing data in isolated silos
d) Reducing the use of dashboards

Which business process is most likely to benefit from predictive BI strategies?
a) Sales forecasting
b) Data entry tasks
c) Physical document management
d) Corporate budgeting

What is the main advantage of cloud-based BI strategies?
a) On-premise data control
b) Increased hardware maintenance
c) Scalability and flexibility
d) Limited data accessibility

A hybrid BI strategy combines:
a) On-premise and cloud-based BI solutions
b) Manual and automated reports
c) Data silos and centralized databases
d) Legacy BI tools only

Which methodology is commonly used in BI project implementation?
a) Agile methodology
b) Waterfall methodology
c) Lean Six Sigma
d) All of the above

What is the biggest risk of not having a BI strategy?
a) Increased report generation
b) Data-driven decision-making
c) Poor data quality and inconsistent insights
d) Reduced IT expenses

The key objective of BI implementation is:
a) Delivering real-time insights
b) Storing as much data as possible
c) Replacing all human decisions with AI
d) Creating a fixed reporting structure

What is the first step in a BI implementation plan?
a) Identifying business objectives
b) Selecting a BI tool
c) Deploying reports
d) Conducting user training

ETL stands for:
a) Extract, Transform, Load
b) Enterprise Technology Lifecycle
c) External Transaction Ledger
d) Evaluate, Test, Launch

BI implementation is most effective when:
a) Users receive proper training
b) Data quality is ignored
c) Reports are static
d) Only IT teams use BI

The biggest challenge in BI implementation is:
a) Data integration from multiple sources
b) Choosing the most expensive BI tool
c) Avoiding data security measures
d) Ignoring user feedback

What is the purpose of BI governance?
a) Ensuring data security, compliance, and quality
b) Restricting BI access to IT teams only
c) Eliminating self-service analytics
d) Increasing data storage costs

BI governance frameworks should include:
a) Data ownership policies
b) Compliance with regulations
c) Security and access controls
d) All of the above

Which compliance regulations impact BI governance?
a) GDPR
b) HIPAA
c) SOX
d) All of the above

Data lineage helps with:
a) Tracking data flow and transformations
b) Deleting old reports
c) Storing only structured data
d) Reducing dashboard complexity

Which of the following is a critical success factor for a BI strategy?
a) Data redundancy
b) Business alignment
c) IT department control
d) Large data storage capacity

What is the most important characteristic of a good BI strategy?
a) Supports business objectives
b) Requires frequent tool upgrades
c) Focuses only on past data
d) Eliminates the need for data governance

A BI strategy should be:
a) Rigid and unchangeable
b) Aligned with organizational goals
c) Focused on IT infrastructure
d) Independent of data governance

A well-designed BI strategy ensures:
a) Only IT teams access data
b) Business users receive actionable insights
c) Reports are always static
d) Data security is ignored

Which KPI is used to measure BI success?
a) Number of dashboards created
b) Business process improvement
c) Volume of data stored
d) Number of data sources

What is the main focus of a BI strategy roadmap?
a) Aligning BI initiatives with business needs
b) Replacing all manual processes
c) Choosing only cloud-based tools
d) Reducing IT staff

What is the advantage of a decentralized BI strategy?
a) Faster decision-making at department levels
b) Increased data silos
c) Less control over data access
d) Higher dependency on IT teams

What should a BI strategy prioritize for a retail business?
a) Customer segmentation and demand forecasting
b) Employee payroll management
c) Office space optimization
d) Software licensing costs

A data-driven culture in a BI strategy ensures:
a) Employees use data for decision-making
b) IT teams handle all reporting
c) Only structured data is used
d) Business users avoid BI tools

What is the benefit of BI standardization in an enterprise?
a) Consistent reporting and data accuracy
b) More fragmented data sources
c) Increased shadow IT practices
d) Reduced BI adoption rates

Which BI maturity model stage focuses on automated analytics?
a) Reactive
b) Proactive
c) Predictive
d) Prescriptive

The main goal of BI governance in strategy is to:
a) Ensure data accuracy, security, and compliance
b) Increase manual reporting efforts
c) Provide BI access only to senior management
d) Ignore industry regulations

What role does data visualization play in a BI strategy?
a) Enhances data interpretation for decision-makers
b) Replaces the need for raw data analysis
c) Stores data efficiently
d) Hides complex trends

Which BI strategy is best for real-time decision-making?
a) Streaming analytics
b) Batch processing
c) Historical reporting
d) Manual data entry

What is the purpose of BI user adoption training?
a) Ensuring users understand BI tools and reports
b) Limiting BI access to technical staff
c) Reducing the number of BI users
d) Increasing IT team workload

A hybrid BI approach is beneficial because:
a) It combines on-premise and cloud-based solutions
b) It eliminates the need for a data warehouse
c) It limits BI access
d) It focuses only on structured data

Which business function benefits most from self-service BI?
a) Marketing
b) IT Support
c) HR
d) Network Security

What is a common risk in BI strategy implementation?
a) Poor data governance
b) High data accuracy
c) Standardized reporting
d) Clear data ownership policies

BI competency centers (BICC) are responsible for:
a) Establishing best practices for BI adoption
b) Managing IT infrastructure only
c) Eliminating BI governance policies
d) Restricting BI usage

Data silos negatively impact BI strategy because they:
a) Prevent data integration and hinder insights
b) Increase data accessibility
c) Improve cross-department collaboration
d) Enhance self-service BI

A key phase in BI implementation is:
a) Data integration
b) Employee onboarding
c) Business process re-engineering
d) Employee payroll processing