Group Assignment KR Prolog

Group assignment
• The group project will be scored between 1 and 10.
• You do not need to pass the group project in order to pass the course.
• There will not be a second attempt opportunity for the group project.
• The deadline for the group project is 9.05.2023 at 21:00 (submission via Canvas).
• Scores will be released on 23.05.2023.

• The task is to write Prolog programs that answer counterfactual explanation questions.
• The project consists of two parts:
1. designing a Prolog knowledge base given a raw file (.csv) containing symbolic terms and confidence values, and
2. implementing Prolog queries to answer counterfactual explanations for interesting scenarios.
Students must upload a single text file (.pl) containing:
• the Prolog knowledge base,
• the queries resolving the counterfactual explanation questions, and
• concise comments discussing relevant implementation details and the solutions provided by each query.
• The file should include the names of all group members as a comment.
• The program must compile, and the queries must run and provide correct answers. There may be several correct ways to build the knowledge base, and all of them will be accepted as long as the queries provide correct answers.

Why this project?
• In this project, you will implement a (simplified) symbolic explanation module to answer counterfactual explanations using the Prolog programming language.
• These explanations will help determine whether individuals are discriminated against in decision-making processes.

Dataset description
• In this project, you will use the German credit dataset, which is used for classifying loan applicants at a bank as good or bad credit risks.
• There are 100 applicants: 59 belonging to the good class (“good_credit”) and 41 belonging to the bad class (“bad_credit”).
• Applicants are described by 20 qualitative and quantitative features:
o status of existing checking account,
o credit duration,
o credit history,
o purpose,
o credit amount,
o savings account/bonds,
o present employment since,
o installment rate in percentage of disposable income,
o personal status and sex,
o other debtors/guarantors,
o present residence since,
o property,
o age in years,
o other installment plans,
o housing,
o number of existing credits at this bank,
o number of people being liable to provide maintenance for, and
o foreign worker.

German Credit Dataset
Abstract: This dataset classifies people described by a set of attributes (features) as good or bad credit risks. Below you can find more information about this dataset.
Data Set Characteristics: Multivariate
Number of Instances: 100
Area: Financial
Attribute Characteristics: Categorical, Integer
Number of Attributes: 20
Date Donated: 1994-11-17
Associated Tasks: Classification
Missing Values? N/A
Number of Web Hits: 765309

Protected features
• Based on the literature, the features age, personal status and sex are considered protected features. Protected characteristics or features are specific aspects of a person’s identity defined by the Equality Act 2010.
• The protection aspect relates to protection from discrimination. The protected groups are gender/female and age/young (younger than 25 years old).
• Overall, when it comes to knowledge encoding, a protected group can be represented by a symbolic value for a protected feature.
Transformation: from numerical to symbolic
• In the dataset we use, the values are already transformed using fuzzy logic theory in the following way:
• In the case of the feature credit duration, we will use the following symbolic terms:
o very short-term (VS),
o short-term (S),
o medium-term (M),
o long-term (L), and
o very long-term (VL).
• In the case of present residence since, we will use the following symbolic terms:
o long time ago (L),
o some time ago (S),
o fairly recent (F),
o recently (R), and
o very recent (VR).
• For the numerical features representing quantities, we will use:
o very low (VL),
o low (L),
o medium (M),
o high (H), and
o very high (VH).

Attribute Information
Attribute 1: (qualitative)
Status of existing checking account
a11 : ... < 0 DM
a12 : 0 <= ... < 200 DM
a13 : ... >= 200 DM / salary assignments for at least 1 year
a14 : no checking account
Attribute 2: (numerical, to be transformed to symbolic)
Duration in months
Attribute 3: (qualitative)
Credit history
a30 : no credits taken / all credits paid back duly
a31 : all credits at this bank paid back duly
a32 : existing credits paid back duly till now
a33 : delay in paying off in the past
a34 : critical account / other credits existing (not at this bank)
Attribute 4: (qualitative)
Purpose
a40 : car (new)
a41 : car (used)
a42 : furniture/equipment
a43 : radio/television
a44 : domestic appliances
a45 : repairs
a46 : education
a47 : (vacation – does not exist?)
a48 : retraining
a49 : business
a410 : others

Attribute 5: (numerical, to be transformed to symbolic)
Credit amount
Attribute 6: (qualitative)
Savings account/bonds
a61 : ... < 100 DM
a62 : 100 <= ... < 500 DM
a63 : 500 <= ... < 1000 DM
a64 : ... >= 1000 DM
a65 : unknown / no savings account
Attribute 7: (qualitative)
Present employment since
a71 : unemployed
a72 : ... < 1 year
a73 : 1 <= ... < 4 years
a74 : 4 <= ... < 7 years
a75 : ... >= 7 years
Attribute 8: (numerical, to be transformed to symbolic)
Installment rate in percentage of disposable income

Attribute 9: (qualitative)
Personal status and sex
a91 : male : divorced/separated
a92 : female : divorced/separated/married
a93 : male : single
a94 : male : married/widowed
a95 : female : single
Attribute 10: (qualitative)
Other debtors / guarantors
a101 : none
a102 : co-applicant
a103 : guarantor
Attribute 11: (numerical, to be transformed to symbolic)
Present residence since
Attribute 12: (qualitative)
Property
a121 : real estate
a122 : if not a121 : building society savings agreement / life insurance
a123 : if not a121/a122 : car or other, not in attribute 6
a124 : unknown / no property

Attribute 13: (originally numerical, now symbolic)
Age in years
a131 : younger than 30 years old
a132 : between 30 and 40
a133 : between 41 and 52
a134 : older than 52 years old
Attribute 14: (qualitative)
Other installment plans
a141 : bank
a142 : stores
a143 : none
Attribute 15: (qualitative)
Housing
a151 : rent
a152 : own
a153 : for free
Attribute 16: (numerical, to be transformed to symbolic)
Number of existing credits at this bank

Attribute 17: (qualitative)
Job
a171 : unemployed / unskilled – non-resident
a172 : unskilled – resident
a173 : skilled employee / official
a174 : management / self-employed / highly qualified employee / officer
Attribute 18: (numerical, to be transformed to symbolic)
Number of people being liable to provide maintenance for
Attribute 19: (qualitative)
Telephone
a191 : none
a192 : yes, registered under the customer's name
Attribute 20: (qualitative)
Foreign worker
a201 : yes
a202 : no
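The profile descriptions in Part 2 have to be translated into the codes listed above. One possible way to keep that translation inside the Prolog file is a small lookup table. The sketch below is only an illustration: the predicate names value/3 and describe/2 and the attribute atoms checking_account and personal_status_sex are hypothetical, and only two of the twenty attributes are included.

```prolog
% value(Attribute, Code, Description): entries copied from Attribute Information.
% Predicate and attribute names are assumptions made for this sketch.
value(checking_account, a11, '... < 0 DM').
value(checking_account, a12, '0 <= ... < 200 DM').
value(checking_account, a13, '... >= 200 DM / salary assignments for at least 1 year').
value(checking_account, a14, 'no checking account').
value(personal_status_sex, a91, 'male: divorced/separated').
value(personal_status_sex, a92, 'female: divorced/separated/married').
value(personal_status_sex, a93, 'male: single').
value(personal_status_sex, a94, 'male: married/widowed').
value(personal_status_sex, a95, 'female: single').

% describe(+Codes, -Pairs): map each code to an Attribute-Description pair.
describe([], []).
describe([Code|Codes], [Attribute-Description|Pairs]) :-
    value(Attribute, Code, Description),
    describe(Codes, Pairs).
```

For example, the query `?- describe([a11, a92], P).` pairs the two codes with their attribute names and descriptions, which makes it easier to check that a profile has been encoded as intended.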
Dataset description
In the dataset, each row (or instance) represents a credit application, while the columns contain the symbolic terms and confidence values for each problem feature.
• Confidence values are in the [0, 1] interval and can be understood as a quality measure. In addition, each instance is labeled with a decision class (good or bad credit risk) and its corresponding confidence value.
• Finally, each instance is associated with a global confidence value that indicates how much that instance conflicts with others in the dataset.
• For example, a confidence value of one suggests that the instance does not conflict with any other instance in the dataset.

confidence values

Group Project Requirements
Part 1: Build a Prolog knowledge base using the symbolic terms and confidence values describing the transformed German credit dataset.
• The knowledge base must be constructed in a way to support the counterfactual explanations for two scenarios concerning bias in the transformed German credit dataset.
Part 2a: Are women discriminated against?
Part 2b: Are young people discriminated against?

Counterfactual Explanations
• A counterfactual explanation describes a causal situation in the following form:
“If 𝑋 had not occurred, 𝑌 would not have occurred”.
For example: “If I had not taken a sip of this hot coffee, I would not have burned my tongue”. The event 𝑌 is that I burned my tongue, while the cause 𝑋 is that I had a hot coffee.
• Counterfactual thinking imagines a reality contradicting observed facts.
• Counterfactual explanations clarify individual predictions and investigate possible scenarios.
• The event is the predicted outcome; the causes are the input values affecting the prediction.

Part 1: Build a Prolog knowledge base using the symbolic terms and confidence values describing the transformed German credit dataset.
Below are some examples of how to design a knowledge base for the symbolic data in the given table. The input predicate takes eight arguments and the output predicate takes two arguments.
The arguments are described by the following symbolic terms: very low (vl), low (l), medium (m), high (h) and very high (vh). Please note that these examples do not include confidence values.
The examples below allow answering the following question:
“Which value should the input variable X have taken so that the output variable Y produces the symbolic value B instead of A?”
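A minimal sketch of such a knowledge base and of the counterfactual question is given here. It is not the table from the assignment: the four facts are invented toy values, and the names instance/2 and counterfactual_first/4 are hypothetical. Each fact pairs the eight symbolic inputs with the two symbolic outputs of one row.

```prolog
% Toy knowledge base (values invented for illustration only).
% instance(input(I1,...,I8), output(O1,O2)).
instance(input(h,  m, l, vl, m, h, l,  m), output(h, m)).
instance(input(l,  h, l, vl, m, h, l,  m), output(h, m)).
instance(input(vl, l, m, m,  h, l, vl, h), output(h, l)).
instance(input(m,  m, l, vl, m, h, l,  m), output(l, m)).

% counterfactual_first(+A, +B, -X0, -X1):
% some instance with first input X0 has first output A, and another
% instance that differs only in its first input (X1) has first output B.
counterfactual_first(A, B, X0, X1) :-
    A \== B,
    instance(input(X0, I2, I3, I4, I5, I6, I7, I8), output(A, _)),
    instance(input(X1, I2, I3, I4, I5, I6, I7, I8), output(B, _)),
    X0 \== X1.
```

With these toy facts, `?- counterfactual_first(h, l, X0, X1).` answers `X0 = h, X1 = m`: had the first input been m instead of h, the first output would have been l instead of h. Plain lookups work on the same facts, for example `?- instance(input(X,_,_,_,_,_,_,_), output(h,_)).` enumerates `X = h`, `X = l` and `X = vl`.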

For example:
• If we ask for the input value “h” in the second place, the query on output(_,Y) gives the result: Y = m.
• If we ask for the output value “h” in the first place, the query on input(X,_,_,_,_,_,_,_) gives the results: X = h, X = l, X = vl.
• If we ask for the input value “h” in the first place, the query on the output value Y gives the result: Y = h.

Following the third variant to build the knowledge base, we can create one rule for each instance in our dataset:
rule( Attribute Information with Confidence Values,good_credit,Confidence Value,Overall ). … bad_credit
To support counterfactual explanations, we need to create appropriate rules for the knowledge base.
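One possible concrete shape for such a rule is sketched here, under loudly stated assumptions: the argument layout (a list of Value-Confidence pairs, then class, class confidence and overall confidence), the helper class_with_confidence/5, and every value are hypothetical, and only four of the twenty attributes are shown to keep the example short. Other layouts are equally acceptable as long as the queries give correct answers.

```prolog
:- use_module(library(pairs)).   % SWI-Prolog: pairs_keys_values/3

% rule(Features, Class, ClassConfidence, OverallConfidence)
% Features is a list of Value-Confidence pairs, one per attribute.
% Attribute order here: checking account, duration, status/sex, age.
% All values and confidences are invented for illustration.
rule([a11-1.0, vs-0.8, a92-1.0, a131-0.9], bad_credit,  0.7, 0.6).
rule([a11-1.0, vs-0.8, a93-1.0, a131-0.9], good_credit, 0.9, 1.0).

% class_with_confidence(+Values, -Class, -FeatureConfs, -ClassConf, -Overall)
% Finds the rule whose symbolic values match Values and returns the class
% together with all confidence values attached to that rule.
class_with_confidence(Values, Class, FeatureConfs, ClassConf, Overall) :-
    rule(Pairs, Class, ClassConf, Overall),
    pairs_keys_values(Pairs, Values, FeatureConfs).
```

Querying `?- class_with_confidence([a11, vs, a92, a131], C, Fs, CC, O).` returns the class of the matching toy instance along with its feature, class and overall confidence values, which is the information Part 2 asks you to report.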

Part 2a: Are women discriminated against?
Write a Prolog query to determine whether the female applicant was discriminated against due to her gender by the AI-powered decision system. In addition, you should determine all confidence values supporting the answer provided by the program.
First sensitive profile (female applicant):
A young female whose existing checking account at the bank is ‘<0DM’, the credit duration is ‘very short’,
her credit history is labeled as ‘existing credits paid back duly until now’, the purpose of the loan is to ‘buy a new car’,
the credit amount is encoded as ‘very low’,
the savings account/bonds is ‘<100DM’, her present employment falls in the category ‘1 <=...< 4 years’,
the installment rate in percentage of disposable income is encoded as ‘very high’, her marital status is ‘separated’,
she does not have ‘other debtors/guarantors’,
her present residence date is ‘fairly recent’, she owns ‘real estate’ property, and she does not have other installment plans.
Moreover, the applicant is the ‘owner’ of her house, her number of existing credits at this bank is labeled as ‘very low’,
she is an ‘unskilled resident’, the number of people being liable to provide maintenance for is labeled as ‘very low’ by the bank,
she does not have a telephone, and she is a ‘foreign worker’.
+ outcome: good/bad_credit,ConfCredit,ConfOverall
Go to Attribute Information and check where each item of the profile belongs.

Part 2b: Are young people discriminated against?
Write a Prolog query to determine whether the applicant was discriminated against due to his young age. In addition, you should determine all confidence values supporting the answer provided by the Prolog program.
Second sensitive profile (young applicant):
A young male whose existing checking account at the bank is ‘>=200DM’, the credit duration is ‘long’,
his credit history is labeled as ‘existing credits paid back duly until now’, the purpose of the loan is to ‘buy a radio/television’,
the credit amount is encoded as ‘medium’,
the savings account/bonds is ‘<100DM’, his present employment duration falls in the category ‘1 <=...< 4 years’,
the installment rate in percentage of disposable income is encoded as ‘very high’, his marital status is ‘single’,
he does not have ‘other debtors/guarantors’, his present residence date is ‘fairly recent’, he ‘owns a car’,
he is a young person and he does not have other installment plans. Moreover, the applicant is the ‘owner’ of his house,
his number of existing credits at this bank is labeled as ‘very low’, he is a ‘skilled employee/official’,
the number of people being liable to provide maintenance for is labeled as ‘very low’,
he does not have a registered telephone, and he is a ‘foreign worker’.
+ outcome: good/bad_credit,ConfCredit,ConfOverall
Go to Attribute Information and check where each item of the profile belongs.
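The idea behind both scenarios can be sketched as a query that keeps every feature fixed except the protected one and checks whether the decision class changes. The code below is a simplified illustration, not the expected solution: the rule/4 layout, the predicate names, and all facts are hypothetical, only four of the twenty attributes are used, and treating a131 as the "young" group is an assumption made for this sketch.

```prolog
% rule(Values, Class, ClassConf, OverallConf); toy facts, invented values.
% Values = [CheckingAccount, Duration, PersonalStatusSex, Age].
rule([a11, vs, a92, a131], bad_credit,  0.7, 0.6).
rule([a11, vs, a91, a131], good_credit, 0.9, 1.0).
rule([a13, l,  a93, a131], bad_credit,  0.8, 0.9).
rule([a13, l,  a93, a133], bad_credit,  0.6, 0.7).

% Codes of attribute 9 grouped by sex (taken from Attribute Information).
female(a92). female(a95).
male(a91).   male(a93).   male(a94).

% gender_counterfactual(+Profile, -Class, -MaleCode, -OtherClass, -Confs):
% a male applicant with otherwise identical features received another class.
gender_counterfactual([C, D, F, A], Class, M, Other, conf(CC1, O1, CC2, O2)) :-
    female(F),
    rule([C, D, F, A], Class, CC1, O1),
    male(M),
    rule([C, D, M, A], Other, CC2, O2),
    Other \== Class.

% age_counterfactual(+Profile, -Class, -OtherAge, -OtherClass):
% an applicant of another age group, otherwise identical, received another class.
age_counterfactual([C, D, S, a131], Class, Age, Other) :-
    rule([C, D, S, a131], Class, _, _),
    rule([C, D, S, Age], Other, _, _),
    Age \== a131,
    Other \== Class.
```

On the toy facts, `?- gender_counterfactual([a11, vs, a92, a131], Class, M, Other, Confs).` succeeds with `Class = bad_credit, M = a91, Other = good_credit, Confs = conf(0.7, 0.6, 0.9, 1.0)`, while `?- age_counterfactual([a13, l, a93, a131], Class, Age, Other).` fails because the older applicant with the same features is also labeled bad_credit. A failing query means only that the knowledge base contains no counterfactual evidence for that profile.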

And more …
Try to find and draw conclusions like:
What happens when instances have identical attributes?
Is there a connection between attributes: does attribute1 correlate with attribute2 for the final outcome to be good_credit?
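The first question can be explored with a query that looks for instances sharing all attribute values but carrying different decision classes. This is a sketch on invented toy facts; the rule/4 layout and the predicate conflict/3 are hypothetical, and only four attributes are shown.

```prolog
% rule(Values, Class, ClassConf, OverallConf); toy facts, invented values.
rule([a11, vs, a92, a131], bad_credit,  0.7, 0.6).
rule([a11, vs, a92, a131], good_credit, 0.5, 0.6).
rule([a13, l,  a93, a131], good_credit, 0.9, 1.0).

% conflict(-Values, -Class1, -Class2):
% two instances have identical attribute values but different classes.
% The standard-order test C1 @< C2 reports each conflicting pair once.
conflict(Values, C1, C2) :-
    rule(Values, C1, _, _),
    rule(Values, C2, _, _),
    C1 @< C2.
```

Here `?- conflict(V, C1, C2).` yields `V = [a11, vs, a92, a131], C1 = bad_credit, C2 = good_credit`. In the real dataset, such conflicting instances would be expected to have an overall confidence value below one.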

Assessment:

Good luck!